HP bh5700 ATCA 14-Slot Blade Server Ethernet Switch Blade First Edition Manufacturing Part Number: AD171-9603A June 2006
Ethernet Switch Blade User's Guide release 3.2.
Legal Notices
The information in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be held liable for errors contained herein or direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material. Restricted Rights Legend.
About the Ethernet Switch Blade Manual
This manual includes everything you need to begin using the HP Ethernet Switch Blade with OpenArchitect software, Release 3.2.2j.
Table of Contents
Chapter 1 Overview of the Ethernet Switch Blade
High Performance Embedded Switching
Advanced TCA® Compliant
OpenArchitect Switch Management
Extensible Customization of Routing Policies
Rapid Spanning Tree
To Enable Rapid Spanning Tree:
Port Path Cost
Layer 3 Switch Configuration
Using the S50layer3 Script
Chapter 5 Fabric Switch Administration
Setting the Root Password
Adding Additional Users
Setting up a Default Route
Name Service Resolution
Example Configuration Scripts
Overview of OpenArchitect VLAN Interfaces
Tagging and Untagging VLANs
Switch Port Interfaces
Layer 2 Switch Configuration
Classical Targets
ZNYX Targets
ZACTION Examples
Extensions to the default matches
tc: Traffic Control
SNMP and OpenArchitect Interface Definitions
ifStackTable Entries
SNMP Configuration
SNMP Applications
Port Mirroring
Booting the Duplicate Flash Image
Chapter 13 Network Configuration Problems
Interface Overview
Physical Interfaces
Default Base Interface Configuration
Chapter 17 Restoring the Factory Default Configuration
Chapter 18 Before Calling Support
Appendix A Fabric Switch Command Man Pages
vrrpconfig
vrrpd
zgr
zgvrpd
zl2d
zl3d
Figure 6.3: Init Script Flow
Figure 7.1: Multiple VLANs
Figure 7.2: Layer 2 Switch
Figure 7.3: Layer 3 Switch
Figure 7.
Chapter 1 Overview of the Ethernet Switch Blade
The Ethernet Switch Blade is a 72-port AdvancedTCA® hub providing Gigabit Ethernet. Up to 14 ATCA node boards may be addressed via the PICMG 3.0 Base Interface and via the ATCA PICMG 3.1 fabric. The Base and Fabric switching domains are kept totally separate, at both the physical layer and the software layer. The Ethernet Switch Blade provides a tightly integrated modular switching platform that enables high-density solutions.
OpenArchitect Switch Management The OpenArchitect software component – open source Linux, IP protocol stack, control applications and the OA Engine – runs on two embedded PowerPC microprocessors. OpenArchitect provides extensive managed IP routing protocols and other open standards for switch management.
Ethernet Switch Blade Port Configuration

Base Switch Quick Reference
ShelfManager1                  zre22
ShelfManager2                  zre13
ISL channel (Base node2)       zre23
Base nodes 3-14                zre0-11
Base nodes 15, 16              zre20-21
Front panel                    zre12, zre14, zre15

Fabric Switch Quick Reference
Slot                           zre numbers
3                              zre0-3
4                              zre4-7
5                              zre8-11
6                              zre12-15
7                              zre16-19
8                              zre24-27
9                              zre28-29
10                             zre30-31
11                             zre32-33
12                             zre34-35
13                             zre36-37
14                             zre38, zre39
15                             zre40-41
16                             zre42-43
Inter-switch Link (ISL)        zre51
Front
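The fabric slot assignments in the quick reference can be captured as a small lookup table. The following Python sketch is illustrative only: the table values are transcribed from the quick reference, while the names FABRIC_SLOT_PORTS and fabric_ports_for_slot are hypothetical and not part of OpenArchitect.

```python
# Fabric switch slot-to-port ranges, transcribed from the quick reference.
# Each value is (first zre number, last zre number), inclusive.
FABRIC_SLOT_PORTS = {
    3: (0, 3), 4: (4, 7), 5: (8, 11), 6: (12, 15), 7: (16, 19),
    8: (24, 27), 9: (28, 29), 10: (30, 31), 11: (32, 33), 12: (34, 35),
    13: (36, 37), 14: (38, 39), 15: (40, 41), 16: (42, 43),
}


def fabric_ports_for_slot(slot):
    """Return the zre interface names for a fabric slot (hypothetical helper)."""
    first, last = FABRIC_SLOT_PORTS[slot]
    return [f"zre{n}" for n in range(first, last + 1)]
```

For example, fabric_ports_for_slot(8) lists the four interfaces zre24 through zre27.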
Installing and configuring the Ethernet Switch Blade is straightforward. UNIX or Linux system management skills and some understanding of network protocols are required. Configure the Ethernet Switch Blades for your networking application before you begin using the OpenArchitect switch.
network-enabled Linux implementation. The purpose of the routing table is to tell the packet forwarding software where to forward the data packets. In Linux, the packet-forwarding algorithm is operated in software. Normally, the routing tables are maintained by operator configuration and the various routing protocols that run in the application environment of Linux. OpenArchitect uses an innovative new approach for forwarding packets.
[Figure 1.2: OpenArchitect Software Structure. Blocks shown: Linux Application Environment; OpenArchitect Application Level Software (zconfig, zl3d, zl2d, zsync); OpenArchitect Libraries zlxlib and ztlib; Linux Application Level Software (routed, gated); Linux Kernel; Linux Protocol Stack; ZNYX RAIN Mgt API (RMAPI); Linux Routing Tables; OpenArchitect Driver; PCI Bus; Switch Fabric.]
OpenArchitect applications are used to program and configure the Ethernet Switch Blade.
Chapter 2 Port Cabling and LED Indicators The PICMG 3.1 standard defines an embedded Ethernet environment for Telco chassis. This environment includes two switch fabric slots that create a dual star Ethernet network to the fourteen node slots. Placing the Ethernet Switch Blade in a hub slot provides embedded Ethernet services to each node card across the Packet Switching Backplane of the chassis.
4. Reinsert the switch into the shelf chassis and power up. Use a terminal emulation program to access the switch console.
Out of Band Ports (OOB Ports)
Each switch, Fabric and Base, in an Ethernet Switch Blade unit has out-of-band (OOB) Ethernet ports on the front panel. An OOB port is an alternative maintenance port that supplies Ethernet connectivity instead of serial connectivity, and it is connected only when performing switch maintenance activities. Use ifconfig to bring up and configure the OOB ports.
Figure 2.1: LED Reference
Chapter 3 High Availability Networking High availability networking is achieved by eliminating any single point of failure through redundant connectivity: Redundant cables, switches and network interfaces for hardware, combined with HA software solutions on both the hosts and switches to control the HA hardware and maintain connectivity. An HA solution called Surviving Partner is provided on the switch. For host-side HA, the most common solution is to use the Linux bonding driver.
VRRP
Because most end nodes use a default router address, changing that address during a switch failover would require the end nodes to reconfigure. Layer 3 switches that fail over must therefore keep the default router address unchanged so that the failover stays transparent to the end nodes. The Virtual Router Redundancy Protocol (VRRP, RFC 2338) running in the Surviving Partner switches provides transparent movement of the default router address.
Switch Replacement and Reconfiguration When a switch fails, it must be replaced. The replacement switch will likely require proper configuration. For transparent switch replacement, the newly replaced switch must learn its configuration from its Surviving Partner. In a simple failover scenario, Host A and Host B are configured with failover between two host ports, one port connected to Switch A and the other connected to Switch B. Assume Switch A provides connectivity between Host A and Host B.
The configuration and runtime scripts created are as follows: • S70Surviving_partner Switch initialization script that is run at boot time. This script will restart the switch with the original configuration given to zspconfig. Optionally, zspconfig will run this script from the initial invocation. • zsp.conf. - zspconfig configuration file that contains the configuration of the sibling backup switches. The is used to distinguish potentially more than one backup switch.
When using a Linux Bonding driver on the node card, the bonding driver should be configured for Mode 1 (active/standby). See the Linux Bonding documentation at http://sourceforge.net/projects/bonding/ for complete information. The two Base switches will be configured as Surviving Partners, using VRRP to form a single virtual interface to the hosts, as will the two Fabric switches. The ports can be configured many different ways, with blocks of ports configured as VLANs.
sibling_addresses: zhp1 = 10.0.0.30, 10.0.0.31 netmask 255.0.0.0;
Now configure the virtual address for each sibling group. We are going to create a virtual interface across one VLAN, but not for the interconnect. This provides a single point to connect/route to the VLANs.
vrrp_virtual_address: zhp1 = 10.0.0.42 netmask 255.0.0.0;
Next come port definitions, as defined on the zspconfig man page.
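Before running zspconfig it can be useful to sanity-check the addresses entered in zsp.conf. The Python sketch below assumes that the virtual address is meant to sit in the same subnet as the sibling addresses, as it does in the example values; this manual does not state that as a rule, and same_subnet is a hypothetical helper, not an OpenArchitect tool.

```python
import ipaddress


def same_subnet(netmask, *addresses):
    """Return True if every address falls in one network under the given netmask."""
    networks = {
        ipaddress.ip_interface(f"{address}/{netmask}").network
        for address in addresses
    }
    return len(networks) == 1


# Values taken from the zhp1 example: two sibling addresses and the virtual address.
zhp1_consistent = same_subnet("255.0.0.0", "10.0.0.30", "10.0.0.31", "10.0.0.42")
```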
#vrrp_mode: block_crossconnect;
The next section determines the failover mode between the Surviving Partner switches. There are three modes:
• switch - Failover by switch. Failover from Master switch to Backup on any port failure. The switch with the most links becomes the new Master. One port failure will cause the switch to fail over.
• vlan - Failover by VLAN. The switch with the most up links in the VLAN becomes the Master of that VLAN.
#start_script:/etc/rcZ.d/SxxScript;
#start_script:/etc/rcZ.d/SyyScript;
# vrrpd_script: Allows the user to add scripts to be executed during
# vrrpd state transitions. These scripts are run from the end of the
# /etc/rcZ.d/surviving_partner/vrrpd.script file. provided
# script must be well behaved. If it crashes, or hangs or delays it will
# affect the SurvivingPartner performance. run in
# background. itself.
Once the configuration files are complete, run the zspconfig utility on the Master to configure all the scripts:
NOTE: This command can take 60 seconds or more with no screen output.
zspconfig -f zsp.conf
You will see output similar to this:
zspconfig -f zsp.conf
….
# This script will likely need modification for your particular
# network setup.
#
# In this example the Egress ports, zre20..23 and zre48..50 are
# not managed by HA since how, or if, these ports are managed by HA is
# dependent on the external devices they are connected to. Non-HA
# egress ports can be brought up through conventional means by adding
# an S-script to /etc/rcZ.d.
zconfig zre0, zre4, zre8, zre12, zre16, zre24, zre28, zre30, zre32, zre34, zre36, zre38, zre40, zre42 = untag1;
zconfig zre1, zre5, zre9, zre13, zre17, zre25, zre29, zre31, zre33, zre35, zre37, zre39, zre41, zre43 = untag2;
zconfig zre2, zre6, zre10, zre14, zre18, zre26 = untag3;
zconfig zre3, zre7, zre11, zre15, zre19, zre27 = untag4;
zconfig zre51 = untag100;
# Recommend using vrrp_mode RAINlink_xmit_on_failover.
zl3d zhp1 zhp2 zhp3 zhp4;
# First address is our address.
vrrp_virtual_address: zhp1 = 10.0.0.42 netmask 255.0.0.0;
vrrp_virtual_address: zhp2 = 11.0.0.42 netmask 255.0.0.0;
vrrp_virtual_address: zhp3 = 12.0.0.42 netmask 255.0.0.0;
vrrp_virtual_address: zhp4 = 13.0.0.42 netmask 255.0.0.0;
# Port definitions
# Define to what the ports are connected. Specifications can be
# by zhp or zre name. The zhp name is a shortcut to specify the
# entire port group associated with that interface. In the end
# these definitions are on a port by port basis.
# crossconnect ports of the VRRP Backup. The block_crossconnect mode is
# meant as a replacement for STP, however, the switches connected to the
# crossconnect ports must be Ethernet Switch switches running
# Surviving Partner.
#
# The RAINlink_xmit_on_failover mode requires that the OpenNode blades
# connected to RAINlink ports transmit a packet when failing over, so that
# the Layer 2 tables will learn the new port/MACaddress relationship.
failover_mode: port;
# VRRP_msg_rate is the time in milliseconds between transmissions of
# VRRP messages on the interconnect. The VRRP protocol requires the
# absence of 3 VRRP messages before concluding that the remote switch
# has failed. The msg_rate must match the msg_rate of all siblings.
# Anything other than multiples of seconds is non-conformant
# with the VRRP specification and will only run with ZNYX supplied
# vrrpd.
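The timing rules in the VRRP_msg_rate comment can be worked through numerically. The following Python sketch is illustrative only: it takes the "3 missed messages" rule and the "multiples of seconds" conformance rule directly from the comment above, and the function names are hypothetical.

```python
def vrrp_failure_detection_ms(msg_rate_ms, missed_messages=3):
    """Approximate time before the partner is declared failed, in milliseconds."""
    return msg_rate_ms * missed_messages


def is_vrrp_conformant(msg_rate_ms):
    """True when the message rate is a whole number of seconds."""
    return msg_rate_ms > 0 and msg_rate_ms % 1000 == 0
```

With a message rate of 1000 ms the remote switch would be considered failed after roughly 3000 ms. A rate of 250 ms shortens that to about 750 ms but, per the comment above, is non-conformant and runs only with the ZNYX supplied vrrpd.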
# Fabric portions of the 7100 switch. The actual coordination is dependent on the
# setting of the board_synchronization_mode and the failover_mode. In
# switch failover_mode the number of up links in both switch planes is
# considered. In vlan and port failover mode they are not. In all
# failover_modes, if the data plane or fabric plane switch reboots or
# power cycles, the HA partner will take mastership for all VLANs in
# both planes. board_synchronization is off by default. "basic" is the
# gated_template: Allows the user to provide a template for the
# gated.conf file to be used by the sibling group.
#gated_template: /etc/rcZ.d/surviving_partner/gated.template
Once the configuration files are complete, run the zspconfig utility on the Master to configure all the scripts:
NOTE: This command can take 60 seconds or more with no screen output.
zspconfig -f zsp.conf
You will see output similar to this:
zspconfig -f zsp.conf
….
Finally, it lets the currently saved S70Surviving_Partner script execute. This is the case when an already configured backup switch powers up while the other HA switch is unavailable, for example after the entire chassis has lost power.
Central Authority
Modifications can be made to the S60SP_startup script to use a third machine running DHCP that is not part of the Surviving Partner pair. The third machine is referred to as the Central Authority.
"zsp.primary.conf";
}
host SECONDARY {
fixed-address 100.0.0.31;
option dhcp-client-identifier "SECONDARY";
option vendor-encapsulated-options "zsp.secondary.conf";
}
}
The zsp.primary.conf and zsp.secondary.conf files must be placed in the tftp location on the machine, often /tftpboot. The zsp.primary.conf and zsp.secondary.conf files can be retrieved from the Surviving Partner switches. This is the configuration that will be given to the switches. It is recommended that the zsp.
request vendor-encapsulated-options;
require vendor-encapsulated-options;
The last step is to modify the startup scripts that run zspconfig to use the -c option. The -c option allows you to provide a dhclient.conf script rather than having zspconfig create a default. For example, the S60SP_startup script line that reads:
echo y n | zspconfig -t 10 -su zhp0 > /dev/null 2>&1
can be modified to:
echo y n | zspconfig -c /etc/rcZ.d/surviving_partner/dhclient.new.
Chapter 4 Fabric Switch Configuration Two switches, two consoles There are two separate switch portions in the Ethernet Switch Blade units, the base switch and the fabric switch. The fabric switch handles the data traffic for the ATCA rack over ports 0-47. It runs the Ethernet Switch Blade software. Two or four GigE connections are provided to node cards using the ATCA backplane.
Changing the Shell Prompt You may use standard bash shell procedures to change the prompts on your base switches. Many sites choose a system that distinguishes among the individual switches at their location. The same rules apply for saving your choice (zsync) as for all other configuration changes. Default Configuration Scripts As shipped the following scripts are run from /etc/rcZ.d as the switch boots up: NOTE: These default scripts will change in later releases. Use them as examples.
Overview of OpenArchitect VLAN Interfaces
A zhp device is associated with one VLAN. A zhp may have one or more physical ports and their associated zre devices. A VLAN from the viewpoint of the switch is a logical mapping of ports based on intended use. The primary purpose of a VLAN is to isolate traffic and enable communication to flow more efficiently within groups of mutual interest. The switch is used to bridge from one VLAN to another. Figure 4.
Switch Port Interfaces For each switch port, OpenArchitect creates a separate interface with its own MAC address called a ZNYX raw Ethernet (zre). After the initial power up, 48 zre interfaces are created, one for each in band port. You cannot directly access or modify the zre interfaces. During the initial power up of the switch, the default configuration creates a Layer 2 switch. The Layer 2 configuration places the zre interfaces in one zhp interface. See Figure 4.
ifconfig zhp1 0.0.0.0
#
# At this point the system will act as a Layer 2 switch
# across all ports. Also, the system will accept telnet()
# connections on 10.0.0.43 on any port. Script(s) may then
# be run to reinitialize the system and modify its
# configuration.
Using the S50layer2 Script
The S50layer2 script can be used as an example, and edited to customize your Layer 2 setup. The default script may not match your physical port configuration.
To Enable Rapid Spanning Tree:
Create a VLAN containing the ports that will be a part of the Linux bridge running Rapid Spanning Tree. This example will use ports 0-3 (untagged):
zconfig zhp0: vlan1=zre0..3
zconfig zre0..3=untag1
Create a bridge device from the zhp device:
zl2d start zhp0
A bridge device named bzhp0 should now exist, consisting of ports zre0 through zre3 with Spanning Tree enabled.
Layer 3 Switch Configuration
The previous section outlines the Layer 2 switch configuration that is automatically configured when you initially bring up the OpenArchitect switch. To communicate between Layer 2 interfaces, you must set up routing properly. The steps to build a Layer 2 switch involve creating a group of switch ports in a VLAN (or Layer 2 switching domain) and bringing that interface up. zconfig creates the VLAN group of switch ports as well as a network interface.
In the S50layer3 script, separate VLANs are set up for each port. The VLANs are labeled as zhp0..zhpn. Each VLAN is associated with an individual zre interface. There is always a one-to-one connection between VLANs and zhp interfaces. Remember, zre and zhp interfaces can begin with a zero value but a VLAN cannot (that is, zhp0 has zre0 on vlan1, zhp1 has zre1 on vlan2). Each zhp interface is assigned a separate IP address in the example script.
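The off-by-one numbering between interfaces and VLANs is easy to get wrong when editing the script by hand. The Python sketch below simply tabulates the relationship stated above (zhp0 has zre0 on vlan1, zhp1 has zre1 on vlan2); layer3_port_plan is a hypothetical helper and does not generate or replace the S50layer3 script.

```python
def layer3_port_plan(port_count):
    """List (zhp, zre, vlan) triples: interfaces count from 0, VLANs from 1."""
    return [
        (f"zhp{n}", f"zre{n}", f"vlan{n + 1}")
        for n in range(port_count)
    ]


# Print the first three entries of the plan.
for zhp, zre, vlan in layer3_port_plan(3):
    print(zhp, zre, vlan)
```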
the number of IP addresses as applicable. In the example below, the IP address is changed for the interface in the ifconfig command line of the script.
From:
ifconfig zhp0 10.0.0.43 netmask 255.255.255.0 broadcast 10.0.0.255 up
To:
ifconfig zhp0 193.08.1.1 netmask 255.255.255.0 broadcast 193.08.1.255 up
• Adjust the number of zhp interfaces that are added to the routing tables, depending on the number of VLANs you are adding for your network.
• Include any other details, as applicable.
interface 10.0.1.42 passive
interface 10.0.2.42 passive
. . .
interface 10.0.13.42 passive
interface 10.0.14.42 passive
interface 10.0.15.42 passive
• Defines the netmask used in the interface.
define 10.0.0.43 netmask 255.255.255.0;
define 10.0.1.42 netmask 255.255.255.0;
define 10.0.2.42 netmask 255.255.255.0;
. . .
define 10.0.13.42 netmask 255.255.255.0;
define 10.0.14.42 netmask 255.255.255.0;
define 10.0.15.42 netmask 255.255.255.0;
• Sets the RIP1 protocol to open.
. .
interface 10.0.13.43 ripin ripout version 1;
interface 10.0.14.43 ripin ripout version 1;
interface 10.0.15.43 ripin ripout version 1;
• Imports routes learned through the RIP protocol.
import proto rip {
    all;
};
• Exports all directly connected routes and routes learned from the RIP protocol.
export proto rip {
    proto direct {
        all;
    };
    proto rip {
        all;
    };
};
To Modify the GateD Scripts:
Copy two GateD files, the OpenArchitect "S" file and its corresponding conf file, into the rcZ.
Or for OSPF:
cp /etc/rcZ.d/examples/S55gatedOspf /etc/rcZ.d
cp /etc/rcZ.d/examples/gated.conf.ospf /etc/rcZ.d
• Open and make configuration changes to the listed conf file to coincide with the current Layer 3 configuration (that is, adjust IP addresses and number of interfaces available). See GateD documentation if you have questions regarding the conf file.
• Run the OpenArchitect zsync command to save your changes. Be sure your changes are correct:
zsync
• Reboot the switch.
Marking and Re-marking The OpenArchitect switch can mark or remark packets using the TOS field or 802.1p tag. This is also controlled through the Linux iptables utility. Scheduling The servicing of configured queues by the switching fabric is referred to as scheduling. The OpenArchitect switch has three built-in scheduling algorithms. The type of scheduling algorithm used is implied, rather than being explicitly specified, based on the number of queues and which options are configured.
you may want to move your set of iptables commands to a startup script to run upon initialization. This could be accomplished by creating a standalone "S" script and placing that script into /etc/rcZ.d.
Restrictions on Implementation
Several restrictions exist on the rules that can be implemented on the FFP hardware. These include:
Actions
DROP the packet.
ACCEPT the packet.
On the other hand, in the following sequence of rules, the position of the rule that drops SYN packets is important. Since the set of fields it examines is not a subset of the fields examined by the ACCEPT rules, and vice versa, the ordering rule given above does not apply. In this case, the order in which it is applied will be the same as its position in the FORWARD chain, and all packets which are TCP SYN packets from zhp5 for zhp3 will be dropped, even if they also match one of the ACCEPT rules.
By default, INPUT, FORWARD and OUTPUT chains are installed on boot up. Additional rules can be installed for the other chains. Additionally, one can write software extensions to add more chains. Figure 4.2 provides an illustration of the Firewall Flow.
[Figure labels: Incoming, Preroute, Input, Routing Decision, Forward, Local Process, Post Route, Outgoing, Output]
Figure 4.
send to CPU action is specified, it is sent to the INPUT chain for further processing. If there is no valid way to forward the packet, it is dropped. If the switch is configured to forward the packet, it is sent to the FORWARD chain. Next the hardware FORWARD chain is walked. If there is a rule inserted that matches the packet headers, then it is looked up next. The inserted policy will decide the packet's fate. In essence, a filter rule will be used to scan the packet data for certain characteristics.
The type can be preceded by ! to match any message except the type listed, for example, --icmp-type ! 1
Specifying TCP or UDP ports
If the protocol is TCP or UDP, the -s (or --sport) and -d (or --dport) options specify the TCP or UDP ports to match. A range of ports can be specified by giving the first and last ports separated by a :, as in --dport 0:1023. It is also possible to precede the port specification with a ! to match all ports which are not included in the range, for example, --sport ! 0:1023.
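The range and negation semantics described above can be modeled in a few lines. This Python sketch is a simplified illustration, not the iptables parser: it accepts only a single port or a first:last range, optionally preceded by !, and port_matches is a hypothetical name.

```python
def port_matches(spec, port):
    """Evaluate a port spec such as "0:1023", "! 0:1023" or "22" against a port."""
    spec = spec.strip()
    negate = spec.startswith("!")
    if negate:
        spec = spec[1:].strip()
    first, _, last = spec.partition(":")
    low = int(first)
    high = int(last) if last else low  # a single port is a one-element range
    inside = low <= port <= high
    return inside != negate  # negation inverts the range test
```

Under this model, port 80 matches 0:1023 and does not match ! 0:1023, while port 8080 matches ! 0:1023.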
--drop        Drops the packet
--accept      Accepts the packet
--set-prio    Set the 802.1p priority to
--use-prio    Use queue priority
--copy-cpu    Send the packet to the CPU. This will force the full installed chains traversal in software
--set-eport   Redirect the packet to port
--set-mport   Mirror the packet to port
--set-tos     Set the IP-Precedence bits in the TOS field of the IP header to
--set-dscp    Set the 6-bit DSCP header to .
FORWARDING Chain supports all of them.
tc and zqosd
tc, which stands for Traffic Control, is a mechanism for enabling Quality of Service on Linux. tc uses three functional objects: queuing disciplines, which comprise queuing and scheduling algorithms such as FIFO queues, priority queues, RED queues, and token buckets; classes, which are leaves in queuing discipline hierarchies; and filters, such as u32 filters and route filters.
qdisc pfifo 100: dev zhp0 limit 32p
The tc command is applied to a device, so dev zhp0 must be specified. Note that a VLAN, such as zhp0, and a port, such as zre0, are each treated as devices.
Breakdown of the options:
handle 100:0
Defines the handle for the queuing discipline. This handle may be used to reference the pfifo queue. Note that the handle is included with the output of the qdisc ls command. (100:0 and 100: are equivalent in tc.) The choice of handle is significant for zqosd.
The byte-limited FIFO queue case differs only slightly from the packet-limited FIFO case. The syntax is almost identical. In hardware the limit is based on 128-byte cells. The specified byte limit is divided by 128 to determine the cell limit. Always specify a byte limit of at least 128 bytes to avoid setting the queue length to zero.
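The byte-to-cell conversion can be checked with a short calculation. The Python sketch below assumes the division rounds down, which is consistent with the warning that a limit under 128 bytes yields a queue length of zero; the manual does not spell out the rounding, and cell_limit is a hypothetical helper.

```python
CELL_BYTES = 128  # hardware cell size stated above


def cell_limit(byte_limit):
    """Convert a tc byte limit to a hardware cell limit (assumes rounding down)."""
    return byte_limit // CELL_BYTES
```

Under that assumption a 4096-byte limit corresponds to 32 cells, and any limit below 128 bytes corresponds to 0 cells.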
index of the list element (numbering from 0) and q is the value specified by that element. So, this example would read:
Priority 0 maps to Queue 1
Priority 1 maps to Queue 2
Priority 2 maps to Queue 2
Priority 3 maps to Queue 2
Priority 4 maps to Queue 3
Note that the tc priority map applies to a 4-bit field. With the Ethernet Switch Blade, the priority map refers to the 802.1p tag, which is a 3-bit field.
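The index-to-queue reading of the priority map can be expressed directly. In this Python sketch the list [1, 2, 2, 2, 3] contains only the five elements spelled out above. Treating entries beyond index 7 as unused for the 3-bit 802.1p tag is an assumption drawn from the field widths, not a statement from this manual, and both function names are hypothetical.

```python
def priority_to_queue(priomap):
    """Map each priority (the list index, from 0) to the queue stored at that index."""
    return {priority: queue for priority, queue in enumerate(priomap)}


def dot1p_entries(priomap):
    """Keep only priorities 0-7, the values a 3-bit 802.1p tag can carry (assumption)."""
    return {p: q for p, q in priority_to_queue(priomap).items() if p < 8}


example = priority_to_queue([1, 2, 2, 2, 3])
```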
The U32 Filter The U32 filter provides the capability to match on fields in the L2, L3 or L4 header of a packet. Each match rule gives the location of the field to be tested, which is always a 32 bit word, a mask selecting the bits to be tested, and a value which is to be matched by the packet field. Many matches can be specified in one tc filter command. Only if all matches succeed does the filter match. In that case, the flowid field identifies the classid of the class this packet belongs in.
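The match rule described above (a 32-bit word at a given location, a mask, and a value, with all matches required to succeed) can be sketched as follows. This Python model is illustrative only: it treats offsets as byte offsets from the start of an IPv4 header and reads words in network byte order, which are assumptions of the sketch, and the function names are hypothetical rather than part of tc or zqosd.

```python
def u32_match(packet, offset, mask, value):
    """Test one match: the masked 32-bit word at offset must equal the masked value."""
    word = int.from_bytes(packet[offset:offset + 4], "big")
    return (word & mask) == (value & mask)


def filter_matches(packet, matches):
    """A filter matches only if every (offset, mask, value) match succeeds."""
    return all(u32_match(packet, o, m, v) for o, m, v in matches)


# Build a 20-byte IPv4 header with protocol 6 (TCP) and destination 10.0.0.42.
header = bytearray(20)
header[9] = 6
header[16:20] = bytes([10, 0, 0, 42])

tcp_to_virtual_address = filter_matches(
    bytes(header),
    [
        (8, 0x00FF0000, 0x00060000),   # protocol byte within the word at offset 8
        (16, 0xFFFFFFFF, 0x0A00002A),  # destination address 10.0.0.42
    ],
)
```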
Although the translation rules handle some inconsistency between software and hardware, a user must define a combination of rules that is reasonable in hardware, to ensure predictable results. Handle Semantics All examples have illustrated zqosd copying tc rules into hardware. In fact, the zqosd utility also enables the user to add tc rules that remain only in software. This selection is based on handles. zqosd processes all supported queue disciplines and filters with handles between 100:0 and 200:FFFF.
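Whether a given handle falls in the range that zqosd processes can be checked mechanically. The Python sketch below assumes the range 100:0 to 200:FFFF is inclusive at both ends and relies on tc handles being hexadecimal major:minor pairs; parse_handle and handled_by_zqosd are hypothetical helpers, not zqosd functions.

```python
def parse_handle(handle):
    """Split a tc handle such as "100:" or "100:0" into (major, minor) integers."""
    major, _, minor = handle.partition(":")
    return int(major, 16), (int(minor, 16) if minor else 0)


def handled_by_zqosd(handle):
    """True if the handle lies between 100:0 and 200:FFFF (bounds assumed inclusive)."""
    return (0x100, 0x0) <= parse_handle(handle) <= (0x200, 0xFFFF)
```

Under these assumptions a queue discipline with handle 100: would be processed by zqosd, while one with handle 1: would not.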
• The PDP sends that policy to the PEP.
• The PEP installs the policy and applies it to future traffic.
As long as COPS is running, a connection between the PEP and PDP should stay open. A PEP could query a PDP at any time asking for a policy decision. Alternatively, an administrator could modify the policy on a PDP, which would then push any policy changes to its PEPs.
Protocol Architecture
The COPS protocol is broken into several components.
The pepd utility requires a PDP that has implemented the above RFCs and drafts. Until all draft standards are approved, certain COPS-PR data types will not be assigned OIDs. pepd uses non-standard OIDs for the unassigned values.
Using pepd
The pepd utility works by connecting to a PDP, informing the PDP of its roles, and installing any rules that the PDP has for those roles. Configuration information should be specified in a configuration file, specified on the command line with the -f option.
Chapter 5 Fabric Switch Administration
One of the main benefits of the OpenArchitect switch is that it runs Linux, so much of the switch administration is already familiar to most network or system administrators. It is a good idea to complement these instructions with a standard Linux reference guide, such as Linux Network Administrator’s Guide available from O’Reilly. Below are brief descriptions of some of the more routine administrative tasks pertinent to the switch.
Enter new password:
Re-enter new password:
Password changed.
ZX7100-OA# zsync
ZX7100-OA#
Setting up a Default Route
If you wish to access the switch from some place other than a directly attached network, you may want to set up a default route. Use the route command to set a default gateway.
route add default gw 10.0.0.254
Put the entry into the /etc/init.d/rcS startup script to automatically set a default route upon reboot.
dhcpd
Consult Linux Network administration manuals for more information on DHCP and configuration options. To use DHCP to set your IP addresses automatically on boot up, uncomment the following line in /etc/init.d/rcS by removing the # sign:
dhcpd
Network Time Protocol (NTP) Client Configuration
NTP is a protocol for setting the real time clock on a system. There are numerous primary and secondary servers available on the network.
/sbin/rpc.statd
/usr/sbin/rpc.mountd -r
Once the above servers are started, you can mount a remote NFS file system.
mount rhost:nfs_file_system local_mount_point
If the remote NFS file system you’re mounting is on an OA switch, you should mount with caching disabled.
mount rhost:nfs_file_system -o noac local_mount_point
All the necessary servers are included in /etc/init.d/rcS but are commented out by default.
Now start nfsd to export the mount points and begin answering requests from remote clients.

/sbin/rpc.nfsd -r

To export file systems automatically on boot, edit /etc/init.d/rcS and uncomment the /sbin/rpc.nfsd command line by removing the #.

/sbin/rpc.nfsd -r

Connecting to the Switch Using FTP

Use ftp to transfer files to or from the switch. See the Linux Reference Guide for details of the ftp command.
SNMP Agent

Simple Network Management Protocol (SNMP) is the de facto standard for network management. An SNMP agent maintains a structure of data for a network device in a virtual information database, called a Management Information Base (MIB). A network management station is capable of accessing the MIB of the network device to monitor and configure the network device. The OpenArchitect switch utilizes the NET-SNMP (formerly UCD-SNMP) agent core.
Supported MIBs
RFC 2573: SNMP Applications
RFC 2574: User-based Security Model (USM) for version 3 of the Simple Network Management Protocol (SNMPv3)
RFC 2575: View-based Access Control Model (VACM) for the Simple Network Management Protocol (SNMP)
RFC 2576: Coexistence between Version 1, Version 2, and Version 3 of the Internet-standard Network Management Framework
RFC 2665: Definitions of Managed Objects for Ethernet-like Interfaces
RFC 2674: Definitions of Managed Objects for Bridges with
Supported Traps
SNMPv2-MIB: coldStart
SNMPv2-MIB: authenticationFailure
IF-MIB: linkUp
IF-MIB: linkDown
UCD-SNMP-MIB: ucdShutdown
RMON-MIB: risingAlarm
RMON-MIB: fallingAlarm
VRRP: vrrpTrapNewMaster
VRRP: vrrpTrapAuthFailure
EGP (RFC 1213): egpNeighborLoss
BGP4-MIB: bgpEstablished
BGP4-MIB: bgpBackwardTransition
Table 5.
Link and SNMP Status

Physical Link Status    SNMP Operational Status
zre1    zre2            zre1    zre2    zhp0
down    down            down    down    down
down    up              down    up      up
up      down            up      down    up
up      up              up      up      up

Table 5.3: Link and SNMP Status

The administrative status is directly controlled by ifconfig up/down. The administrative statuses of the zhps and zres do not affect each other.
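The rows of Table 5.3 are consistent with a simple rule: a zhp reports an operational status of up when at least one of its member zres has physical link. The sketch below encodes that reading. It is an interpretation of the table, not ZNYX code; the function name is hypothetical, and the first argument models the rule that an administratively down interface always reports down.

```shell
#!/bin/sh
# Sketch of the rule the table suggests (an interpretation, not ZNYX code):
# a zhp is operationally "up" when its administrative status is up and at
# least one member zre has physical link.
#   $1             = administrative status of the zhp (up|down)
#   remaining args = physical link state of each member zre (up|down)
zhp_oper_status() {
    admin="$1"
    shift
    if [ "$admin" != "up" ]; then
        echo down
        return 0
    fi
    for link in "$@"; do
        if [ "$link" = "up" ]; then
            echo up
            return 0
        fi
    done
    echo down
}
```

For example, zhp_oper_status up down up corresponds to the second row of the table (zre1 down, zre2 up).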
response. The processing for link up and link down traps is now user configurable. By default, traps conform to RFC 2863, meaning the trap contents will include ifIndex, ifAdminStatus, and ifOperStatus. You can alter this behavior by specifying:

cisco_link_traps on

If cisco_link_traps is turned on as described, link up and link down traps will have a Cisco-like format, and the trap contents will include ifDescr and ifType. Examine and edit /usr/share/snmp/snmpd.
mirrored (copied and transmitted) to port 12. This mirroring would be in addition to any Layer 3 or Layer 2 switching. zmirror zre0 zre12 zmirror zre1 zre12 zmirror zre2 zre12 To clear the current mirroring use the -t option. The -e option can be used to indicate that packets being sent on a given port should be copied to the mirror_to port. For example if the -e option is used as follows, the packets transmitted, as opposed to received, on ports 0, 1 or 2 would be mirrored to port 12.
Chapter 6 Fabric Switch Maintenance

This chapter includes basic information about the OpenArchitect switch environment including an overview of the file system structure, modifying and updating switch files, upgrading the switch driver and kernel, and implementing a system recovery.

Overview of the OpenArchitect switch boot process

The OpenArchitect switch is equipped with a Random Access Memory (RAM) disk and three Read-Only Memory (ROM) devices: a boot ROM and two application flash devices.
1. The bootloader examines the boot string in the boot ROM.
2. If the boot string is dev1, the bootloader loads the image from Flash 1 to RAM.
3. Otherwise, if the boot string is dev2, the bootloader loads the image from Flash 2 to RAM.
4. If an image was loaded, execution of the RAM image begins. If the boot string is neither dev1 nor dev2, the switch boots into the zmon bootloader.

Figure 6.2: Boot Flow Chart

Under normal circumstances, the booting up process follows the process outlined in Figure 6.2.
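The decision in the boot flow chart can be modeled in a few lines. This is an illustrative sketch only: the real logic lives in the bootloader, and the function name and the strings it prints are made up for the example.

```shell
#!/bin/sh
# Illustrative model of the boot decision (not bootloader code).
#   $1 = boot string read from the boot ROM
boot_target() {
    case "$1" in
        dev1) echo "load Flash 1 image to RAM and execute it" ;;
        dev2) echo "load Flash 2 image to RAM and execute it" ;;
        *)    echo "boot into zmon bootloader" ;;
    esac
}
```

Any boot string other than dev1 or dev2 falls through to the zmon bootloader, which matches the two "No" branches of the chart.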
Figure 6.3: Init Script Flow (/etc/init.d/rcS, /etc/rcZ.d/rc, S* scripts)

Saving Changes

Any modifications made to the scripts for your particular configuration must be properly saved or your changes are lost when you reboot. The file system for the switch only exists in memory. A rewritable overlay is contained within the upper four megabytes of the first application flash.
configuration files contained in /etc/rcZ.d. In order to telnet into the box, there must be a configured interface with a proper IP address. For example, zhp0 is configured with the IP address 10.0.0.43 in the factory default configuration.

Booting with the -i option

If you cannot telnet into the switch and Linux fails to boot, it is likely that a change saved by zsync has left the switch in an inaccessible state.
zsync /etc/hosts • Reboot the system. System Hangs During Boot After attaching the system console cable, if the system hangs during boot, try booting with the –i option as described in the previous section. It is possible that important Linux system files became corrupted and incorrectly saved in the flash overlay. Use zmnt as described in the previous section to fix or remove the problem files from the overlay.
Download the OpenArchitect image to a local system. The OpenArchitect image is very close to the limit of free space available on a default system so you may need to clear some space prior to downloading the OpenArchitect image to the switch. Check for free space with the df command. One of the easiest ways to create free space is to remove /usr/sbin/gated. The application will be replaced during the update procedure. Once you have enough free space, proceed.
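The free-space check can be scripted so that the download only proceeds when there is room. The helper below is a minimal sketch: the name has_free_kb is hypothetical, the required size is a placeholder for the actual size of the image you downloaded, and it assumes the df on the system accepts the -P and -k options and that awk is available.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file system holding directory $1 has
# at least $2 kilobytes available.
has_free_kb() {
    dir="$1"
    need_kb="$2"
    # df -k reports sizes in 1K blocks; -P keeps each file system on one line.
    # Field 4 of the second output line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}
```

A typical use would be has_free_kb /tmp 8192 || echo "free some space first", where both the directory and the 8192 KB figure are illustrative values.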
Using apt-get

apt-get is a utility created by the Debian Linux community to allow remote fetching and installation of software stored in a repository in Debian package format. It allows users to keep their software up-to-date with the latest binaries, and install new software without the need to recompile. Users may create their own repositories and add entries in /etc/apt/sources.list (empty by default) for their private access methods to their private repository. See http://www.debian.
Chapter 7 Base Switch Configuration At this point, the OpenArchitect Ethernet Switch Blade should be installed and powered up for the first time. This chapter helps you connect and configure the base switch by presenting command line examples as well as a discussion of the example configuration scripts. You may configure the fabric switch independently from the base switch. Two switches, two consoles There are two separate switches in the Ethernet Switch Blade.
files into flash for reloading. Changing the Shell Prompt You may use standard bash shell procedures to change the prompts on your base switches. Many sites choose a system that distinguishes among the individual switches at their location. The same rules apply for saving your choice (zsync) as for all other configuration changes. Default Configuration Scripts As shipped the following scripts are run from /etc/rcZ.
• S50multivlan - Script which sets up multiple untagged VLANs. The first VLAN includes the first ten 10/100/1000 ports, the next contains the last ten 10/100/1000 ports, the third VLAN contains two 10/100/1000 ports, the last VLAN contains the last two 10/100/1000 ports. Layer 3 switching is enabled. • S55gatedRip1 - Script which is used with a Layer 3 switch and calls the GateD daemon to enable RIP 1 routing protocol.
Figure 7.1: Multiple VLANs Tagging and Untagging VLANs The OpenArchitect switch is capable of switching VLAN tagged and untagged data packets. VLAN tagged packets conform to the 802.1q specification and the packet header contains an additional four bytes of VLAN tag information. A given port can be specified to accept VLAN tagged or untagged traffic. Internally, all traffic for a particular VLAN is treated as tagged traffic.
Figure 7.2: Layer 2 Switch (Linux IP interface zhp0 at 10.0.0.42; VLAN 1 containing zre0 through zre23, 24 10/100/1000 ports)

During the initial power up, a startup script called /etc/rcZ.d/S50layer2 is executed at boot time creating a single untagged VLAN (IP interface labeled as zhp0) which includes all Ethernet and gigabit ports as one Layer 2 switch. The interface to the host is then assigned the IP address of 10.0.0.42 to allow access to the switch.
Using the S50layer2 Script

The S50layer2 script can be used as an example, or edited to customize your Layer 2 setup. For example, to reconfigure the IP address on your Layer 2 switch:
• Open the S50layer2 file in the Linux vi editor.
• Change the IP address value listed under the Linux ifconfig(1M) command line.
• Save your changes by running OpenArchitect zsync.
• Reboot the switch.
brctl show
brctl showbr bzhp0

Port Path Cost

Each port has an associated cost that contributes to the total cost of the path to the Root Bridge when the port is the root port. The smaller the cost, the better the path. The Ethernet Switch Blade uses the following IEEE 802.1D recommendations based on the connection speed of your port:

Port Path Cost
Link Speed    Recommended Value    Recommended Range
10 Mb/s       100                  50-600
100 Mb/s      19                   10-60
1 Gb/s        4                    3-10

Table 7.
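The recommended values can be turned into a quick lookup, and summing them shows how link speed affects the total cost of a path to the Root Bridge. This is a sketch for illustration only: the function names are hypothetical, and the numbers are the recommended values from the table above.

```shell
#!/bin/sh
# Recommended IEEE 802.1D path cost for a link speed given in Mb/s.
recommended_path_cost() {
    case "$1" in
        10)   echo 100 ;;
        100)  echo 19 ;;
        1000) echo 4 ;;
        *)    echo "unknown link speed: $1" >&2; return 1 ;;
    esac
}

# Total cost of a path that crosses links of the given speeds (Mb/s).
path_total_cost() {
    total=0
    for speed in "$@"; do
        cost=$(recommended_path_cost "$speed") || return 1
        total=$((total + cost))
    done
    echo "$total"
}
```

Under these values, a path over one 1 Gb/s link and one 100 Mb/s link costs 4 + 19 = 23, so it is preferred over a path across a single 10 Mb/s link, which costs 100.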
zconfig zhp1: vlan2=zre5..8 zconfig zre5..8=untag2 Now, use ifconfig to assign each zhp interface an IP address, ifconfig zhp0 10.0.0.1 ifconfig zhp1 11.0.0.1 At this point, the Linux host has enough information to route between the networks of the directly attached interfaces, 10.0.0.0 via zhp0, and 11.0.0.0 via zhp1. The next step is to enable the ZNYX zl3d daemon to move that routing information from the host to the base switch switching tables in silicon.
[Figure: Linux IP interfaces zhp0 - zhp23; VLAN 1 through VLAN 24, each containing a single port, zre0 through zre23.] Each VLAN interface (zhp) has only one switch port (zre). Figure 7.
Runs the OpenArchitect zl3d. The zl3d application monitors the Linux routing tables and updates the switch routing tables for each interface configured above. /usr/sbin/zl3d zhp0..23 • zl3d initially creates and adds each zhp interface (VLAN) to the switch routing tables. The zhp0..zhp23 is shorthand for the list of interfaces (zhp0, zhp1, …, zhp23) to monitor with zl3d. To Modify the Layer 3 Script • Modify the example script you copied into the /etc/rcZ.d directory.
VLAN 4, zhp3: for last set of six ports, zre18-zre23

Each VLAN interface is labeled zhpN in the file, where N is a value from 0-3. Each interface is untagged and assigned its own IP address (see Figure 7.4).

[Figure: Linux IP interfaces zhp0 through zhp3; VLAN1 contains zre0-zre5, VLAN2 contains zre6-zre11, VLAN3 contains zre12-zre17, VLAN4 contains zre18-zre23.] Each VLAN (zhp) contains 6 ports (zres). Figure 7.
(10.0.0.42-10.0.3.42), assigns the netmask and brings them up. ifconfig zhp0 10.0.0.42 netmask 255.255.255.0 broadcast 10.0.0.255 up ifconfig zhp1 10.0.1.42 netmask 255.255.255.0 broadcast 10.0.1.255 up ifconfig zhp2 10.0.2.42 netmask 255.255.255.0 broadcast 10.0.2.255 up ifconfig zhp3 10.0.3.42 netmask 255.255.255.0 broadcast 10.0.3.255 up • Runs the OpenArchitect zl3d command.
example):
• Starts GateD with RIP1 using gated.conf.rip1 as the configuration file:
/usr/sbin/gated -f /etc/rcZ.d/gated.conf.rip1
The GateD conf file specifies the following configuration commands:
• Implements the passive function so GateD is prevented from rerouting information to a different interface if insufficient information is received.
interface 10.0.0.42 passive
interface 10.0.1.42 passive
interface 10.0.2.42 passive
. . .
interface 10.0.13.42 passive
interface 10.0.14.42 passive
interface 10.0.
interface all noripin noripout
• Opens sending and receiving packets for selected interfaces.
interface 10.0.0.42 ripin ripout version 1;
interface 10.0.1.42 ripin ripout version 1;
interface 10.0.2.42 ripin ripout version 1;
. . .
interface 10.0.13.42 ripin ripout version 1;
interface 10.0.14.42 ripin ripout version 1;
interface 10.0.15.42 ripin ripout version 1;
Imports routes learned through the RIP protocol.
cp /etc/rcZ.d/examples/gated.conf.rip1 /etc/rcZ.d
Or for RIP2:
cp /etc/rcZ.d/examples/S55gatedRip2 /etc/rcZ.d
cp /etc/rcZ.d/examples/gated.conf.rip2 /etc/rcZ.d
Or for OSPF:
cp /etc/rcZ.d/examples/S55gatedOspf /etc/rcZ.d
cp /etc/rcZ.d/examples/gated.conf.ospf /etc/rcZ.d
• Open and make configuration changes to the listed conf file to coincide with the current Layer 3 configuration (that is, adjust IP addresses and number of interfaces available).
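When the number of Layer 3 interfaces changes, the per-interface RIP statements in the conf file have to change with it. The loop below is a minimal sketch that prints those statements so they can be pasted into the file. It assumes the 10.0.N.42 addressing and the "ripin ripout version 1" form used by the RIP1 example; the function name is hypothetical.

```shell
#!/bin/sh
# Print one RIP interface statement per Layer 3 interface, following the
# 10.0.N.42 addressing of the example configuration.
#   $1 = number of interfaces (N runs from 0 to $1 - 1)
gated_rip1_interfaces() {
    count="$1"
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "interface 10.0.$n.42 ripin ripout version 1;"
        n=$((n + 1))
    done
}
```

For example, gated_rip1_interfaces 4 prints the statements for 10.0.0.42 through 10.0.3.42, matching a four-VLAN Layer 3 setup.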
Marking and Re-marking The OpenArchitect switch can mark or remark packets using the TOS field or 802.1p tag. This is also controlled through the Linux iptables utility. Scheduling The servicing of configured queues by the switching fabric is referred to as scheduling. The OpenArchitect switch has three built-in scheduling algorithms. The type of scheduling algorithm used is implied, rather than being explicitly specified, based on the number of queues and which options are configured.
Running zfilterd

Before starting zfilterd, ztmd must be running. You can start both from within a script, or directly from the command line. For example,

ztmd
zfilterd

iptables rules can be entered at any time. If your iptables filtering rule set is extensive, you may want to move your set of iptables commands to a startup script to run upon initialization. This could be accomplished by creating a standalone "S" script and placing that script into /etc/rcZ.d.
action that will take place. For example, the rules:

iptables -A FORWARD -i zhp3 -j DROP
iptables -A FORWARD -i zhp3 -o zhp1 -p tcp --dport smtp -j ACCEPT

result in SMTP packets received on any port in zhp3 and destined for any port in zhp1 being accepted; all other packets from zhp3 would be dropped. The order of the two rules in the FORWARD chain does not matter. On the other hand, in the following sequence of rules, the position of the rule that drops SYN packets is important.
Introduction

Firewall rules are stored in tables. These tables are sometimes also known as firewall chains or just chains. Tables normally store rules for what are known as hooks, which can be viewed as packet-path junctions. There are five defined hooks: PRE-ROUTE, POST-ROUTE, INPUT, OUTPUT and FORWARDING. The example below illustrates the default chains on boot up. By default, INPUT, FORWARD and OUTPUT chains are installed on boot up. Additional rules can be installed for the other chains.
Packet Walk When a packet comes in via one of the interface ports, the base switch makes a routing decision. If the packet was destined for the base switch itself or if the send to CPU action is specified, it is sent to the INPUT chain for further processing. If there is no valid way to forward the packet, it is dropped. If the switch is configured to forward the packet, it is sent to the FORWARD chain. Next the hardware FORWARD chain is walked.
--icmp-type ping

The type can be preceded by ! to match any message except the type listed, for example:

--icmp-type ! 1

Specifying TCP or UDP ports

If the protocol is TCP or UDP, the --sport and --dport options specify the TCP or UDP source and destination ports to match. A range of ports can be specified by giving the first and last ports separated by a :, as in --dport 0:1023.
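A port range slots into a rule like any single port. The helper below is a sketch that builds, but does not run, a FORWARD rule for a destination port range after checking that the range is sane. The helper name is hypothetical; the iptables options it prints (-A, -p, --dport, -j) are the standard ones.

```shell
#!/bin/sh
# Build (but do not run) an iptables rule matching a destination port range.
#   $1 = protocol (tcp|udp), $2 = first port, $3 = last port, $4 = target
port_range_rule() {
    proto="$1"; first="$2"; last="$3"; target="$4"
    if [ "$first" -lt 0 ] || [ "$last" -gt 65535 ] || [ "$first" -gt "$last" ]; then
        echo "invalid port range: $first:$last" >&2
        return 1
    fi
    echo "iptables -A FORWARD -p $proto --dport $first:$last -j $target"
}
```

For example, port_range_rule tcp 0 1023 DROP prints a rule that drops forwarded TCP traffic to the privileged ports; the printed line can be reviewed and then run by hand.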
ZNYX Targets

ZACTION
This is the ZNYX Action target. Parameters for ZACTION:
--drop       Drops the packet
--accept     Accepts the packet
--set-prio   Set the 802.1p priority to
--use-prio   Use queue priority
--copy-cpu   Send the packet to the CPU.
Extensions to the default matches These are described in the Linux packet filtering HOWTO at: http://netfilter.org/documentation/index.html#documentation-howto ZNYX FORWARDING Chain supports all of them. tc: Traffic Control The switch supports up to eight queues for each port, including the cpu port. These queues hold packets waiting to be transmitted for a given port.
queue number + 1 after the qdisc handle. So the highest priority queue in this example is 105:8. NOTE: 16 values must be provided for the priomap list. This is a feature of the Linux priority system, which uses 16 priority levels. The last eight values given will be ignored.
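Both rules in this passage are mechanical, so they can be sketched as two small helpers. The names are hypothetical. The first derives a class handle from a qdisc handle and a queue number; the second pads an eight-entry priority map to the 16 values tc expects by repeating it, which is an arbitrary choice given that the last eight values are ignored.

```shell
#!/bin/sh
# Class handle for a queue: the qdisc handle, a colon, then queue number + 1.
#   $1 = qdisc handle number (for example 105), $2 = queue number (0-7)
queue_class_handle() {
    echo "$1:$(($2 + 1))"
}

# Pad an eight-entry priority map to 16 values by repeating it.
# The switch ignores the last eight values, so their content does not matter.
pad_priomap() {
    if [ "$#" -ne 8 ]; then
        echo "expected 8 priomap values, got $#" >&2
        return 1
    fi
    echo "$* $*"
}
```

With a qdisc handle of 105, queue 7 maps to 105:8, which is the highest priority queue named above.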
handle 100:0
Defines the handle for the queuing discipline. This handle may be used to reference the pfifo queue. Note that the handle is included with the output of the qdisc ls command. (100:0 and 100: are equivalent in tc.) The choice of handle is significant for zqosd.

root
Tells tc that this is the base queuing discipline for the device, not a child of another queuing discipline.

pfifo limit 32
Specifies a packet-limited FIFO queue with an upper bound of 32 packets.
match ip tos 0xa0 0xe0 would match an IP precedence of 5. Specific fields can also be specified by giving their offset from the beginning of the IP header and a field name of u8, u16, or u32, depending on the width of the field. For example, to match the SYN bit in the TCP flags, the specification is: match u8 2 0x02 at 33 Several IP fields can be matched in the same filter by specifying multiple match operations. The filter will be satisfied only if all matches are true.
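The value and mask arithmetic behind the TOS match can be checked with shell arithmetic. IP precedence occupies the top three bits of the TOS byte, so the mask is 0xe0 and the value is the precedence shifted left by five bits. The helpers below are a sketch with hypothetical names; they only compute and compare numbers and do not call tc.

```shell
#!/bin/sh
# Print the u32 TOS match for an IP precedence (top three bits of the TOS byte).
#   $1 = precedence, 0-7
tos_match_for_precedence() {
    prec="$1"
    if [ "$prec" -lt 0 ] || [ "$prec" -gt 7 ]; then
        echo "precedence must be 0-7" >&2
        return 1
    fi
    printf 'match ip tos 0x%02x 0xe0\n' "$((prec << 5))"
}

# Does a TOS byte satisfy "match ip tos VALUE MASK"?
# The field matches when (tos AND mask) equals (value AND mask).
tos_matches() {
    tos="$1"; value="$2"; mask="$3"
    [ $(($tos & $mask)) -eq $(($value & $mask)) ]
}
```

Precedence 5 gives 5 << 5 = 0xa0, which is the value in the example. A TOS byte of 0xb8 matches, because 0xb8 AND 0xe0 is 0xa0. The offset in the SYN example follows similar arithmetic: assuming a 20-byte IP header with no options, the TCP flags byte sits at offset 13 of the TCP header, and 20 + 13 = 33.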
tc qdisc add dev zre1 ingress    # ingress qdisc for zre1
tc qdisc add dev zhp2 ingress    # ingress qdisc for VLAN

The filter add command changes slightly: the parent is now the special handle ffff:fff1. Using the same filter as the first example:

tc filter add dev zre1 parent ffff:fff1 protocol ip u32 match ip dst 10.91.100.5/32 classid 105:2

This filter will match packets arriving on port zre1, destined for port zre5, with destination IP address 10.91.100.5.
omitted, and the packet is not dropped, the egress queue will be determined by the priority of the packet, either from the 802.1p priority for tagged packets or the default priority for untagged packets for the ingress port. Examples The following commands set up priority queues for packets sent to the CPU and then use filters with policing to direct packets into these queues and limit their bandwidth.
specified numerically for either out-of-profile or in-profile actions. The numeric value is a decimal integer action code shown in the table below. If the action requires a parameter, the parameter value is multiplied by 256 and added to the action code. Only a few of the actions are possible for out-of-profile. All can be used for in-profile. Policing Actions Action Code Out Action Set 802.
for a u16 match. In many cases, there is a field name that can be used for the match, eliminating the need to specify the offset. U match selectors Field Match Equivalent ip src a.b.c.d/n u32 at 12 ip dst a.b.c.
OpenArchitect switch though, because the normal case is for packets to be switched in hardware. For that reason, zqosd must be used to shadow tc configuration into hardware. Like zfilterd, zqosd works with ztmd, which provides the actual hardware interaction.
In tc, the prio queuing discipline establishes multiple queues and specifies their associated priority map. Although WRR support is not part of the standard tc distribution, it has been added to the prio discipline. The following example illustrates WRR. A strict priority scheduler is a simpler case that can be constructed easily from this example. Examine the existing CoS settings on the switch, noting the number of queues per port, queue sizes, scheduling parameters, and priority map.
many packets sent as queue 0, queue 2 will have four times as many, and queue 3 will have six times as many. WRR parameters are scaled such that the maximum value is no more than 15. Values that would be 0 are set to 1:
• Queue 0 has a weight of 1000 bytes
• Queue 1 has a weight of 2000 bytes
• Queue 2 has a weight of 4000 bytes
• Queue 3 has a weight of 6000 bytes
The remaining commands each define a packet-limited FIFO queue.
tc filter add dev zhp0 protocol arp parent 100:0 u32 match u32 2 0xffff at +4 flowid 100:30 Combining Queuing Disciplines Any of the queue length limiting disciplines can be used with the bandwidth management queue disciplines, by defining them with the handle of one of the classes as their parent. For the htb queueing discipline, each class has an explicit handle specified when it is defined.
Figure 7.6: COPS Network Architecture (one PDP, three PEPs)

A PDP contains all of the policy rules for its associated PEPs. A PDP typically stores rules in a database and is a dedicated server, not a forwarding device. A PEP is any network device that has to enforce policy decisions. For example, a switch that restricts network access or prioritizes traffic fits the definition of a Policy Enforcement Point. A PEP makes no policy decision. It simply applies the policy that it receives from its PDP.
and relaying those requests to its PDP. By contrast, the provisioning model is based on longer lasting policy. The expectation is that policy should be administratively defined at the PDP and pushed to the PEPs as needed. OpenArchitect is a COPS-PR client. The most common use of COPS-PR is for distributing Differentiated Services (Diffserv) policy. Diffserv is concerned with such Quality of Service elements as queues and schedulers. OpenArchitect PEP The OpenArchitect PEP implementation is known as pepd.
where,
PDP address: The IP address of the PDP. Default is loopback (127.0.0.1).
PDP port: The destination port on which to open a COPS connection. Default is 3288.
PEPID: The PEP Identifier.
Role-If: A mapping of roles to interfaces. The name of the role is followed by a comma-delimited list of interfaces. Multiple role-interface mappings are defined through multiple Role-If declarations.
Chapter 8 Base Switch Administration

One of the main benefits of the OpenArchitect switch is that it runs Linux, so much of the switch administration is already familiar to most network or system administrators. It is a good idea to complement these instructions with a standard Linux reference guide, such as the Linux Network Administrator’s Guide available from O’Reilly. Below are brief descriptions of some of the more routine administrative tasks pertinent to the switch.
ZX6000-OA# zsync
ZX6000-OA#

Setting up a Default Route

If you wish to access the switch from some place other than a directly attached network, you may want to set up a default route. Use the route command to set a default gateway.

route add default gw 10.0.0.254

Put the entry into the /etc/init.d/rcS startup script to automatically set a default route upon reboot.

Name Service Resolution

Name service lookups will be done locally using /etc/hosts.
Network Time Protocol (NTP) Client Configuration NTP is a protocol for setting the real time clock on a system. There are numerous primary and secondary servers available on the network. For more NTP information, and a list of available NTP servers, see the following URL: http://www.ntp.org/ You will need to have your network settings properly configured to reach an available NTP server on your local network or the Internet. To set the time and date, execute ntpdate with the server of your choice.
All the necessary servers are included in /etc/init.d/rcS but are commented out by default. To automatically start all NFS client services each time you boot, uncomment the NFS Client servers. Go to the /etc/init.d/rcS file. Uncomment the following command lines by removing the # sign. /sbin/portmap /sbin/rpc.statd /usr/sbin/rpc.mountd -r You can also include commands to mount remote NFS file systems at boot time. There is an example line included at the appropriate location in /etc/init.d/rcS.
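Removing the # by hand is simple, but the edit can also be scripted. The helper below is a minimal sketch, not an OpenArchitect tool: the name uncomment_command is hypothetical, it assumes awk is available on the system, and the file to edit is a parameter so it can be tried on a copy of /etc/init.d/rcS first. It uncomments only lines whose text after the # starts with the given command, and it rewrites the file in place so the file's permissions are kept.

```shell
#!/bin/sh
# Hypothetical helper: remove the leading "#" from lines that start the given
# command, leaving every other line (including ordinary comments) unchanged.
#   $1 = command path as it appears in the file, $2 = file to edit
uncomment_command() {
    cmd="$1"
    file="$2"
    tmp="$file.tmp.$$"
    if awk -v cmd="$cmd" '
        {
            line = $0
            sub(/^#[ \t]*/, "", line)      # the line without its leading "#"
            if ($0 ~ /^#/ && index(line, cmd) == 1)
                print line                 # commented-out command: uncomment
            else
                print $0                   # anything else: keep as is
        }' "$file" > "$tmp"; then
        # Copy the content back rather than renaming, to keep file permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

For example, calling uncomment_command for /sbin/portmap, /sbin/rpc.statd, and /usr/sbin/rpc.mountd against the startup script enables the three NFS client servers; run zsync afterwards to save the change.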
ftpd Server Configuration

The switch itself can also be configured to run an FTP server (ftpd). See the Linux Reference Guide for details of the ftpd command. You will need to add a user to the switch in order to connect via ftp from a remote host, since root is not allowed ftp access. See the earlier section in this chapter regarding how to add a user. The ftp daemon is started by default. If you wish to shut down the ftp daemon, comment out the betaftpd line in /etc/init.d/rcS.
Supported MIBs
RFC 1155: Structure and Identification of Management Information for TCP/IP-based internets
RFC 1227: SNMP MUX Protocol and MIB
RFC 1493: Definitions of Managed Objects for Bridges (obsoletes RFC 1286)
RFC 1657: Definitions of Managed Objects for the Fourth Version of the Border Gateway Protocol (BGP-4) using SMI-V2
RFC 1724: RIP Version 2 MIB Extension (obsoletes RFC 1389)
RFC 1850: OSPF Version 2 Management Information Base (obsoletes RFC 1253, which obsoletes RFC 1252, which obs
Supported MIBs
ZNYX Networks Private MIB: Custom ZNYX MIB to support software and hardware features not covered by standard MIBs. The Private MIBs are ZX7100BASE.MIB and ZX7100FABRIC.MIB, pointed to by ZNYX-H.MIB.
UCD-SNMP Enterprise MIB: UCD-SNMP MIB related to management and monitoring of the Linux host.
Table 8.
status is down, then the operational status will be down independent of the underlying link state. You must ifconfig up the zres to see the operational link status for a zre. When the administrative status is up, the operational status is dependent on the underlying physical state. For example, Table 8.3 shows how, when zhp0 contains zre1 and zre2, the physical link status of each port carries through to the operational status (given that the administrative status is up on zre1, zre2, and zhp0).
IMPORTANT: For NET-SNMP agents, these objects (sysLocation.0, sysContact.0 and sysName.0) ordinarily are read-write. However, specifying the value for one of these objects by giving the appropriate token in snmpd.conf makes the corresponding object read-only, and attempts to set the value of the object will result in a notWritable error response. The processing for link up and link down traps is now user configurable.
zmirror mirror_from mirror_to After executing the following three commands, packets received on ports 0, 1 and 2 would be mirrored (copied and transmitted) to port 12. This mirroring would be in addition to any Layer 3 or Layer 2 switching. zmirror zre0 zre12 zmirror zre1 zre12 zmirror zre2 zre12 To clear the current mirroring use the -t option. The -e option can be used to indicate that packets being sent on a given port should be copied to the mirror_to port.
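When many ports are mirrored to the same destination, the receive-side commands can be generated in a loop. The helper below is a sketch with a hypothetical name. It prints the zmirror commands in the mirror_from mirror_to form shown above rather than running them, so the list can be reviewed first.

```shell
#!/bin/sh
# Print (rather than run) one zmirror command per source port.
#   $1 = mirror_to port, remaining args = mirror_from ports
mirror_commands() {
    to="$1"
    shift
    for from in "$@"; do
        echo "zmirror $from $to"
    done
}
```

For example, mirror_commands zre12 zre0 zre1 zre2 prints the three commands from the example; piping the output to sh on the switch would execute them.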
Chapter 9 Base Switch Maintenance This chapter includes basic information about the OpenArchitect switch environment including an overview of the file system structure, modifying and updating switch files, upgrading the switch driver and kernel, and implementing a system recovery.
1. The bootloader examines the boot string in the boot ROM.
2. If the boot string is dev1, the bootloader loads the image from Flash 1 to RAM.
3. Otherwise, if the boot string is dev2, the bootloader loads the image from Flash 2 to RAM.
4. If an image was loaded, execution of the RAM image begins. If the boot string is neither dev1 nor dev2, the switch boots into the zmon bootloader.

Figure 9.2: Booting up Process Flow

Under normal circumstances, the booting up process follows the process outlined in Figure 9.2.
Figure 9.3: Init Script Flow (/etc/init.d/rcS, /etc/rcZ.d/rc, S* scripts)

Saving Changes

Any modifications made to the scripts for your particular configuration must be properly saved or your changes are lost when you reboot. The file system for the switch only exists in memory. A rewritable overlay is contained within the upper four megabytes of the first application flash.
Booting with the –i option If you cannot telnet into the switch and Linux fails to boot, it is likely that a change saved by zsync has left the switch in an inaccessible state. To allow users to recover from mistakes saved in the overlay file system, a boot argument of –i passed to the init process will stop the untarring of the saved overlay files. As a result, the system boots to the factory-shipped configuration. • Connect through the console port.
System Hangs During Boot

After attaching the system console cable, if the system hangs during boot, try booting with the -i option as described in the previous section. It is possible that important Linux system files became corrupted and were incorrectly saved in the flash overlay. Use zmnt as described in the previous section to fix or remove the problem files from the overlay. If the system will not boot with the -i option, refer to the Booting the Duplicate Flash Image section in this chapter.
the limit of free space available on a default system, so you may need to clear some space prior to downloading the new OpenArchitect image to the switch. CAUTION: Do not remove the existing copy of /usr/sbin/gated (as suggested in Step 5, below) until you have, in fact, determined that an OpenArchitect upgrade version is available for downloading. 5. One of the easiest ways to create free space is to remove /usr/sbin/gated, as the application will be replaced during the update procedure.
Using apt-get

apt-get is a utility created by the Debian Linux community to allow remote fetching and installation of software stored in a repository in Debian package format. It allows users to keep their software up-to-date with the latest binaries, and install new software without the need to recompile. Users may create their own repositories and add entries in /etc/apt/sources.list (empty by default) for their private access methods to their private repository. See http://www.debian.
Chapter 10 Connecting to the Ethernet Switch Blade

The Ethernet Switch Blade has two completely separate switching subsystems within one ATCA blade, supporting both the Base Interface and the Fabric Interface.

Figure 10.1: Fabric and Base

The Ethernet Switch Blade implements an independent control processor and software environment for both Base and Fabric Interface switching subsystems. Troubleshooting problems are similar in both environments.
console port. An RS-232 to RJ-45 adapter is required. Fabric Interface Hub System: A 48-port Gigabit Ethernet Switch that provides PICMG 3.1 Option 2 (2.0 Gb/s) Ethernet service for a full 14-slot ATCA chassis. All connectors for the fabric interface hub and its processor are labeled “fabric”. Ethernet Interfaces: The 3.1 Fabric Interface switching system provides 48 ports of Gigabit Ethernet service with PICMG 3.1 option 2 (2.0 Gb/s) links for all line cards installed and option 3 (4.
Figure 10.2: Base Interface Serial Port To attach the console cable to the Ethernet Switch Blade switch: 1. Plug the RJ-45 end of the console cable (P/N 6900-63006, shipped with the HP bh5700 ATCA 14-Slot Blade Server) into the RJ-45 Console Port (1) on the front panel. 2. Connect the DB-9 end of console cable into a standard Modem Eliminator Cable (normally locally available). 3.
NOTE: The OOB port is not active by default in the factory configuration. The first time you log into the switch, either in-band or through the console cable, you must use the ifconfig command to make the port active.

Connecting to the Fabric Interface

Fabric Interface Serial Port Connection

The switch console can be accessed via one RJ-45 10/100 serial port (3) located on the front panel of the Ethernet Switch Blade. The RS-232 RJ-45 console port may be used to recover from a system failure.
9. Reinsert the switch into the system and power up. 10. Use a terminal emulation program to access the switch console. Fabric Interface Out of Band Ethernet Connection Connect an Ethernet cable from the Ethernet Switch Blade front panel MGMT OOB (4 in Figure 10.3) to your PC. 1. Configure a host on the 10.0.0.0 network. 2. The OpenArchitect switch is preconfigured with address 10.0.0.42. telnet to 10.0.0.42. telnet 10.0.0.42 3. After you are connected, enter the login name root. No password is required.
Chapter 11 Diagnosing a Failed Ethernet Switch Blade Activation Figure 11.1: Ethernet Switch Blade Activation States The Ethernet Switch Blade must transition through a series of states (M0–M4) to become active in an ATCA shelf. After the Ethernet Switch Blade has reached the M4 state, it will become active and start the boot process of the OpenArchitect Switch Management environment.
FRU State HotSwap LED Status M0 OFF Healthy LED Status OFF Solution No power. Board not inserted correctly. 1. Remove and re-insert board. 2. If board does not power-up after re-insertion, try a different slot. If board continues to fail in the new slot and the problem does not affect other boards running in the chassis, return the Ethernet Switch Blade board for repair. M1 ON ON No Communication with the Shelf Manager. Make sure the hot swap ejector handle is securely closed.
FRU State HotSwap LED Status Healthy LED Status Solution switch through a console cable. If OpenArchitect is running, and abnormal behavior is occurring, please see Network Configuration Problems for information on network issues. If OpenArchitect cannot be accessed through the console port, please see Troubleshooting a Failed OpenArchitect Load. Table 11.
sensor information. Examine the System Event Log (SEL) on the ShMM and determine if critical sensor events have been logged for the switch in question. If the switch has reported critical sensor data for temperature or voltage, the ShMM can prevent it from booting. To determine if the critical sensor events persist, it may be necessary to alter the rules enforced by the ShMM to allow the switch to receive back-end power and boot (see the ShMM documentation for instruction).
clia board -v 7 or clia board -v 8 These commands generate output that reports whether the ShMM believes it has granted access to ports on the switches. Check the Shelf Manager User’s Guide for the expected output.
Chapter 12 Troubleshooting a Failed OpenArchitect Load The OpenArchitect operating system is loaded from the FlashROM memory into RAM when the Ethernet Switch Blade is activated by the Shelf Manager. If OpenArchitect fails to load because of a hardware failure or a corrupt file system, the back-up image can help to troubleshoot the condition. This chapter provides tips for troubleshooting a failed OpenArchitect load.
[Flowchart: The Ethernet Switch Blade has been enabled by the ShMM and starts to boot. The bootloader examines the bootstring in the Boot ROM. If the bootstring is dev 1, it loads the image from Flash device 1. If the bootstring is dev 2, it loads the image from Flash device 2. Otherwise, it cannot boot OpenArchitect. After an image is loaded, execution of the RAM image begins.] Figure 12.
Figure 12.2: ROM Devices in OpenArchitect The boot ROM is located on device 0 and contains the OpenArchitect zmon application that operates as a boot loader and includes a device bootstring. Device 1 contains the application flash 1 image of the Linux operating system and the OpenArchitect overlay file system. Application flash 1 is the primary working image for the switch. Device 2 contains the application flash 2 that is an exact copy of application flash 1.
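The device selection that the boot loader makes from the bootstring can be sketched as follows. This is an illustrative Python model only, not zmon code: the function name is hypothetical, and representing the bootstring as the text "dev 1" or "dev 2" is an assumption made for the example.

```python
def select_boot_device(bootstring):
    """Return the flash device number to boot from, or None if neither matches.

    Hypothetical model: the bootstring is assumed to read "dev 1" or "dev 2".
    """
    # Collapse whitespace and ignore case so "Dev  1" is treated like "dev 1".
    normalized = " ".join(bootstring.split()).lower()
    if normalized == "dev 1":
        return 1  # application flash 1, the primary working image
    if normalized == "dev 2":
        return 2  # application flash 2, the duplicate image
    return None   # neither device selected: OpenArchitect cannot be booted


if __name__ == "__main__":
    for value in ("dev 1", "dev 2", "dev 9"):
        print(value, "->", select_boot_device(value))
```

Returning None rather than raising keeps the "cannot boot" outcome explicit for the caller, which mirrors the three possible results of the bootstring check.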
properly attach the console cable. Booting Without the Overlay File If you cannot telnet into the switch and Linux fails to boot, it is likely that a change saved by zsync has left the switch in an inaccessible state. To allow users to recover from mistakes saved in the overlay file system, a boot argument of -i passed to the init process prevents the saved overlay files from being untarred. As a result, the system boots to the factory-shipped configuration. 1. Connect through the console port.
If the switch still is unable to boot, see Booting the Duplicate Flash Image, below. Booting the Duplicate Flash Image Another recovery method, if Linux fails to boot, is to temporarily boot the factory-installed duplicate image located in the second flash device. 1. Connect through the console port. 2. When you see the number counter appear after the zmonitor ... banner, press any key on the console keyboard to enter the zmon application. 3. At the monitor prompt, type: boot:2 4.
Chapter 13 Network Configuration Problems Many reported problems on a booted switch will ultimately be traced back to user errors in the layer 2 or layer 3 switch configuration. In some cases, symptoms from an improperly configured switch can masquerade as potential hardware problems. Interface Overview On startup OpenArchitect creates interfaces for all Ethernet ports on the Ethernet Switch Blade.
Physical Slot 1 2 Fabric Port 3 3 4 5 6 7 8 9 10 11 12 19 11 - - 3 7 15 27 13 14 15 16 51** Fabric * Base Interface Inter-Switch Link (ISL) ** 10 Gigabit Ethernet Fabric Interface - Update Channel Table 13.
2. S30e1000 - Script that loads the e1000 driver module for the Out-of-Band Ethernet ports. (Editing this script is not recommended.) S40vpd - Script that checks the current OA version, and loads into the Vital Product Data (VPD) area if necessary. (Editing this script is not recommended.) 3. S50layer2 - Script that sets up a basic Layer 2 switch. All 24 10/100/1000 ports are set up on one IP network (VLAN). Figure 13.1: Default Base Interface Network Diagram
OpenArchitect login: root sh-2.04# ifconfig lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:16144 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) zhp0 Link encap:Ethernet HWaddr 00:11:65:09:E0:18 inet addr:10.0.0.42 Bcast:10.0.0.255 Mask:255.255.255.
Figure 13.2: Linux Networking Environment Interfaces ifconfig Default Screen Output for the Base Interface [ZX7100-OA3.2.2h]# ifconfig lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:16144 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) zhp0 Link encap:Ethernet HWaddr 00:11:65:0B:C0:38 inet addr:10.0.0.42 Bcast:10.0.0.255 Mask:255.255.255.
Configuration Troubleshooting Problem Solution No Connection Physical Link problem. Check to see if the port LED is lit. If the port LED is not lit, then you may have a bad cable connection. OR Configuration Error. Connect through the console port (see Chapter 10). Use the ifconfig command to see all of the configured interfaces on the Ethernet Switch Blade and their IP addresses. Check to make sure that the IP address of the switch and the subnet mask are properly set.
The following table translates the zlc output to link status. Link Zre (x) Port Status EKEY_DISABLED Link Speed Auto Pause Enable EKEY_ENAB 1000fd LED Faults Internal Fault OK ON Disable External Fault UP 1000hd DOWN 100fd 100hd 10fd 10hd Link: zre(X) – physical interface Shelf Manager Status: EKEY_DISABLED - A slot or device that has been disabled by the Shelf Manager.
10hd – Ethernet Half Duplex Pause: Enable: The port can temporarily suspend data transmission between two network devices when one of the devices becomes congested. Pause-enabled devices can reduce bottlenecks by making the network more efficient. Disabled: The pause feature is not enabled, and the port will continue to transmit traffic even when the receiving device is busy.
sh-2.04# zlc zre0..
[ZX7100-OA3.2.2h]# zlc zre0..
Network Connectivity Troubleshooting No Connection If the port LED is lit on the front panel, the switch has established a physical connection and the problem is a network configuration error. Check to see if both devices are configured to be on the same network (for example, 10.0.0.xxx) and that the subnet mask is set correctly. Diminished Network Throughput Depending on how the switch is configured, throughput problems can reflect configuration errors in the network topology.
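The same-network check described under No Connection can be done by hand, or sketched with Python's standard ipaddress module on a management workstation. The function name and the sample host addresses below are hypothetical; only 10.0.0.42 comes from this guide, and the 255.255.255.0 mask is assumed for the example.

```python
import ipaddress


def same_network(addr_a, addr_b, netmask):
    """Return True if both addresses fall in the same network under netmask."""
    # strict=False lets us pass a host address and have the host bits masked off.
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    return net_a == net_b


if __name__ == "__main__":
    # Switch address from this guide versus two hypothetical host addresses.
    print(same_network("10.0.0.42", "10.0.0.7", "255.255.255.0"))
    print(same_network("10.0.0.42", "10.0.1.7", "255.255.255.0"))
```

If the two devices use different subnet masks, run the check once with each mask; a mismatch in either direction points to a configuration error.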
have an active remote device attached, then first bring down the ports that are not expected to have active connections, to make sure there is a legitimate EXT FLT condition. If loss of communications is suspected on an externally wired port, make sure to check and test affected cables. Network Tests Ping Test It is possible to test a network connection by using the ping command. The ping command will send a network packet to the specified IP address and wait for a reply.
Traceroute Test It’s possible to trace a network path using the traceroute command. The following is an example of a Layer 2 traceroute with only two devices.
sh-2.04# traceroute 192.168.1.101
traceroute to 192.168.1.101 (192.168.1.101), 64 hops max, 40 byte packets
 1  192.168.1.101 (192.168.1.101)  1.888 ms  1.135 ms  0.814 ms
sh-2.04#
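When traceroute output is captured for a support case, the hop lines can be pulled out with a short script. The following is a minimal Python sketch, assuming hop lines of the form shown in the example above (hop number, host name, address in parentheses, round-trip times in ms); the function name is hypothetical and other traceroute output variants, such as timeouts marked with asterisks, are not handled.

```python
import re

# Matches a hop line such as " 1  192.168.1.101 (192.168.1.101)  1.888 ms ..."
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)\s+(.*)$")


def parse_traceroute(output):
    """Return a list of hops, each a dict with hop number, host, address, times."""
    hops = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # skips the header line and shell prompts
        number, host, address, rest = match.groups()
        times = [float(t) for t in re.findall(r"([\d.]+)\s+ms", rest)]
        hops.append({"hop": int(number), "host": host,
                     "address": address, "times_ms": times})
    return hops


if __name__ == "__main__":
    sample = (
        "traceroute to 192.168.1.101 (192.168.1.101), 64 hops max, 40 byte packets\n"
        " 1  192.168.1.101 (192.168.1.101)  1.888 ms  1.135 ms  0.814 ms\n"
    )
    for hop in parse_traceroute(sample):
        print(hop)
```

A single hop whose address equals the target, as in the example, indicates that the two devices are directly reachable at Layer 2.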
Chapter 14 Isolating Hardware Failures
Figure 14.1: ATCA Base Inside View
1. Flash
2. EEPROM
3. PHY
4. CPU
5. SDRAM
6. Isolation Transformer
7. IPMI Controller
8. Power Supply
9. Switch Chip (U56)
10. Switch Chip (U69)
11. Zone 3 ATCA Connector
12. Isolation Transformers
13. 4-port PHY
14. Zone 2 ATCA Connector
15. Zone 1 ATCA Connector
16. Isolation Transformers
17. 4-port PHY
18. Fuses
Figure 14.2: ZMC Daughter Card Outside View
1. Isolation Transformer
2. Zone 3 ATCA Connector
3. Isolation Transformer
4. Switch Chip (U60)
5. SDRAM
6. Switch Chip (U59)
7. Isolation Transformer
Figure 14.3: ZMC Daughter Board Inside View
1. Isolation Transformer
2. 4-port PHY
3. CPU (U22)
4. 10 Gigabit XFP
5. 10 Gigabit PHY
6. Isolation Transformer
7. Power Supply
8. Flash ROMs
9. FPGA
10. ZMC Connector
11. Zone 3 ATCA Connector
12. Power Supply
13. Isolation Transformers
14. 4-port PHY
Hardware Subsystem In the following tables, refer to the component-area numbers indicated in the pictures in the preceding section. The indications of malfunction may be identified either during normal operation, or in response to a specific test. The various tests that may be initiated are shown in subsequent sections.
Base ZMC 0 ZMC 1 # # Hardware Subsystem Indications of Malfunction any of the following indications: • Error message via OpenArchitect due to inability to access the registers within the switch chip, or a failure of DMA transfers. • Loss of switch functionality, such as the inability to forward packets, or forwarding packets in error. 8 Power Supply 12 3, 6, 12, 2, 4, 6, 13, 16, 13, 15 17 9 1, 2, 4 A power-supply failure will generally result in lack of boot activity.
Duplicate Flash Image. If the switch can successfully boot from FlashROM device 2, then FlashROM device 2 is fully operational. Testing the Switch Fabric You can test the functionality of the switch fabric by running the zlc command. The zlc command outputs the link status for any Ethernet Switch Blade interface. Link Status for a single port To query the link status of a single port, pass the port's zre interface name followed by query to zlc, for example: zlc zre13 query Example Output: sh-2.
Example Output: sh-2.04# zlc zre0..
If the “Used” and “Free” memory statistics do not add up to the Total memory, the software environment may have a memory leak caused by a software error. Reboot the switch. If the problem persists after a reboot, run the top command to list the memory utilization of all current processes. sh-2.04# top The top command can help you isolate software-related memory problems to specific processes.
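The Used plus Free check against Total is simple arithmetic and can be scripted against captured output. The sketch below is illustrative Python, assuming the captured text contains a line beginning with "Mem:" followed by the total, used, and free figures in that order; that layout, the function names, and the sample numbers are assumptions, so adjust the parsing to the output your switch actually prints.

```python
def parse_mem_line(captured_output):
    """Return (total, used, free) from the first line starting with 'Mem:'."""
    for line in captured_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            total, used, free = (int(value) for value in fields[1:4])
            return total, used, free
    raise ValueError("no 'Mem:' line found in captured output")


def unaccounted_memory(total, used, free):
    """Return how much of the total is neither used nor free (0 is healthy)."""
    return total - (used + free)


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    sample = (
        "             total       used       free\n"
        "Mem:        126484      31220      95264\n"
    )
    total, used, free = parse_mem_line(sample)
    print("unaccounted:", unaccounted_memory(total, used, free))
```

A nonzero result is the condition this section describes as a possible memory leak; the reboot and top steps then apply.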
To test the operational status of the control processors you can do the following: Hardware Fault Connect to the console port of either the Base or Fabric Interface control processor (See Chapter 10 for more information). If you cannot communicate with the Ethernet Switch Blade, the control processor may have encountered a software error. Reboot the switch to clear the error.
INT FLT LED is illuminated, replace the switch and return it for repair.
Chapter 15 High Availability Troubleshooting The ATCA environment will usually contain a high-availability failover configuration between two ATCA switches in the chassis. Note that the failover features are configurable and a switch can be directed to fail over all of its processing when a single port or link goes down, or it can perform a port-to-port or VLAN-to-VLAN failover where both partner switches are still processing a portion of the network traffic.
Chapter 16 Switch Firmware Overview There are three components to the firmware on the Ethernet Switch Blade: 1. Bootloader firmware (zmon) 2. OpenArchitect firmware 3. IPMI firmware Some hardware and software problems can be resolved by updating the firmware to the latest version. Check the Hewlett-Packard website for the latest version (see the HP bh5700 ATCA 14-Slot Blade Server Installation Guide).
Key: PN: Base Interface Switch Assembly Number SN: Base Interface Switch Serial Number V6: OpenArchitect Version Number VP: IPMI Firmware Version VZ: BootLoader Version Number The following output is shown for the 3.1 Fabric Interface: 3.1 Fabric Interface [ZX7100-OA3.2.
Updating the Switch Firmware Currently, the OpenArchitect and bootloader components are the only upgradeable firmware on the Ethernet Switch Blade. Upgrading the IPMI software is not currently supported. BootLoader Firmware Upgrade: 1. Download the bootloader image to a local system. 2. FTP the bootloader image from the local system to your switch. 3. Use the zflash command to write the new elgoro/zmon image into the boot flash device. Be sure to use device 0, not device 1 or 2.
Surviving Partner daemons to think there is a failure, resulting in link oscillation. Base Interface: zflash -d 1 rdr6000.zImage.initrd Fabric Interface: zflash -d 1 rdr7100.zImage.initrd IPMC Firmware Upgrade: Upgrading the IPMC Firmware through OpenArchitect is not currently supported.
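Because writing an image to the wrong flash device can leave the switch unbootable, a provisioning script can check the target device before issuing zflash. The following Python sketch only builds the OpenArchitect command lines shown above and encodes the device rule from this chapter (bootloader on device 0, OpenArchitect on device 1); the function and table names are hypothetical, and the script does not run zflash itself.

```python
# Device rule taken from this chapter: boot flash is device 0,
# the primary OpenArchitect image is written to device 1.
EXPECTED_DEVICE = {"bootloader": 0, "openarchitect": 1}

# Image file names as given for each interface in this chapter.
OPENARCHITECT_IMAGES = {
    "base": "rdr6000.zImage.initrd",
    "fabric": "rdr7100.zImage.initrd",
}


def check_device(component, device):
    """Raise ValueError if device is not the expected target for component."""
    expected = EXPECTED_DEVICE[component]
    if device != expected:
        raise ValueError(
            f"{component} must be written to device {expected}, not {device}")
    return device


def openarchitect_zflash_argv(interface):
    """Return the zflash argument list for the given interface's image."""
    image = OPENARCHITECT_IMAGES[interface]
    device = check_device("openarchitect", 1)
    return ["zflash", "-d", str(device), image]


if __name__ == "__main__":
    print(" ".join(openarchitect_zflash_argv("base")))
    print(" ".join(openarchitect_zflash_argv("fabric")))
```

The bootloader command line is deliberately not generated here, since its exact arguments are not reproduced in this section; check_device can still be used to confirm that device 0 is the target.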
Chapter 17 Restoring the Factory Default Configuration You should use this procedure if the contents in Flash Device 1 are corrupt and you need to restore the switch to the factory default configuration. By restoring the factory default configuration, you will overwrite your main file system in Flash Device 1 and lose all previous configuration changes. IMPORTANT: Make sure that Surviving Partner is not running before using zflash.
Chapter 18 Before Calling Support Because of the highly customized configurations that can be applied by customers to their ATCA switch environment, the focus must be on data collection to get a snapshot of the current switch configuration and network traffic activity. Before calling support, gather the following information for further diagnosis: 1.
Figure 18.1: ROM Devices in OpenArchitect [Diagram labels: Boot ROM on Device 0 (zmon, free space, dev bootstring at offset 7f000); Application Flash 1 on Device 1 (initrd, Linux and its file system, free space, overlay file system); Application Flash 2 on Device 2 (initrd, exact copy as in Application Flash 1, Linux and its file system, free space).] The boot ROM is located on device 0 and contains the OpenArchitect zmon application that operates as a boot loader and includes a device bootstring.
Appendix A Fabric Switch Command Man Pages OpenArchitect applications are implemented above the OpenArchitect libraries and the RMAPI interface. OpenArchitect applications are used for normal operation of the switch, for runtime status and diagnostics, and for prototyping new applications development. For runtime operation, the OpenArchitect applications perform initialization and configuration, and real-time control and maintenance of the switching tables in the switch silicon.
vrrpconfig NAME vrrpconfig – Configure and control the running vrrpd SYNOPSIS vrrpconfig [-d ] -- vrrpconfig [-d ] [-k] [-a] [-p] [-s ] DESCRIPTION vrrpconfig provides communication with a running vrrpd daemon. The -- option for vrrpconfig will pass all parameters to vrrpd as would be done when starting the vrrpd. Any output generated by vrrpd is displayed on the vrrpconfig controlling tty.
EXAMPLES Here is an example of using the -- invocation method that changes the priority to 99 for the Virtual Router associated with the Virtual Router Identifier 1: vrrpconfig -- -v 1 -p 99 SEE ALSO vrrpd
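When priorities are changed from a management script, the vrrpconfig command line in the example above can be assembled programmatically. This Python sketch builds only that documented invocation; the function name is hypothetical, and the range checks (VRID 1 to 255, priority 1 to 254) follow RFC 2338 as an assumption, since the daemon itself may accept other values.

```python
def vrrpconfig_priority_argv(vrid, priority):
    """Return the argument list for: vrrpconfig -- -v <vrid> -p <priority>."""
    # Range checks follow RFC 2338 (assumption; vrrpd may be more permissive).
    if not 1 <= vrid <= 255:
        raise ValueError("Virtual Router Identifier must be between 1 and 255")
    if not 1 <= priority <= 254:
        raise ValueError("priority must be between 1 and 254")
    # Everything after "--" is passed through to the running vrrpd.
    return ["vrrpconfig", "--", "-v", str(vrid), "-p", str(priority)]


if __name__ == "__main__":
    print(" ".join(vrrpconfig_priority_argv(1, 99)))
```

Building the command as a list avoids shell quoting problems if it is later passed to a process launcher.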
vrrpd NAME vrrpd – Virtual Router Redundancy Protocol Daemon SYNOPSIS vrrpd -i ifname -v vrid [-f piddir] [-s] [-a auth] [-p prio] [-nhb] [-I ifname] [-d delay] [-m address] [-M ] [-B] [-S script] [-c conf_file] [-D level] ipaddr DESCRIPTION vrrpd is an implementation of the Virtual Router Redundancy Protocol (VRRPv2) as specified in RFC2338. It runs in Linux user space. In short, VRRP is a protocol that elects a Master server on a LAN; the Master answers to a virtual IP address.
the -i option. -s Toggle preemption mode (Enabled by default). Preemption means that a Master switch will go to Backup if a current Backup has higher priority. -M Become MASTER when priority is equal. Be sure it is only set on one host or the switches will oscillate. Must set -B option on other hosts (requires preemption mode ! -s) -B Become BACKUP when priority is equal. -S