Node Provider Networking Troubleshooting Guide

From Internet Computer Wiki

This page is designed to guide you through common Node Provider Networking issues and processes.


How to check the port status of a deployed node

To verify the port status of the deployed node on the switch, follow these steps:

  1. Identify the switch: Determine the switch to which the node is connected.
  2. Access the switch: Use a console cable or remote management interface (SSH, Telnet, etc.) to connect to the switch.
  3. Log in to the switch: Enter the appropriate credentials (username and password) to access the switch's command line interface.
  4. Identify the port: Determine the port on the switch to which the node is connected. This information may be provided during the deployment or can be obtained by physically tracing the network cable.
  5. Check the port status: Use the commands below to check the status of the specific port on the switch. The exact command depends on the platform you use.
  6. Analyze the output: The command output will provide details about the port's status, including its operational state, link status, speed, duplex mode, and any error or drop counters. Look for the following key information:
    • Operational state: It should be "up" or "connected" for the port to be active.
    • Link status: It should indicate "up" for the port to have a functional connection.
    • Speed and duplex: Verify that the configured speed and duplex settings match the expected values.
    • Error counters: If there are a high number of errors or drops, it may indicate issues with the connection.
  7. Troubleshooting: If the port status is not as expected or indicates any issues, you can perform further troubleshooting steps. Some common troubleshooting actions include checking the physical cable connections, restarting the node and switch, verifying VLAN configurations, and ensuring the switch port configuration matches the requirements of the node.
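
The checks in step 6 can be partly automated once the command output has been saved to a file or variable. The following is a minimal Python sketch, assuming the Cisco Nexus `show interface status` column layout shown in this guide (Port, Name, Status, Vlan, Duplex, Speed, Type) and a port description without spaces. The function name `find_problem_ports`, the expected values, and the second and third sample rows are illustrative assumptions, not real device output or vendor tooling.

```python
# Sketch: flag ports whose status, duplex, or speed differ from what is expected.
# Assumes whitespace-separated columns: Port Name Status Vlan Duplex Speed Type,
# and a Name (description) column without spaces.

# Hypothetical sample; only the first row matches the output shown in this guide.
SAMPLE = """\
Port          Name               Status    Vlan      Duplex  Speed   Type
Eth1/1        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/2        Server:WAN         notconnec 1         auto    auto    10Gbase-SR
Eth1/3        Server:WAN         connected 1         full    1000    10Gbase-SR
"""


def find_problem_ports(output, expected_speed="10G", expected_duplex="full"):
    """Return a list of (port, reason) tuples for ports that need attention."""
    problems = []
    for line in output.splitlines():
        fields = line.split()
        # Skip separators, headers, and anything that is not an Ethernet port row.
        if len(fields) < 6 or not fields[0].startswith("Eth"):
            continue
        port, _name, status, _vlan, duplex, speed = fields[:6]
        if status != "connected":
            problems.append((port, f"status is {status}"))
        elif duplex != expected_duplex:
            problems.append((port, f"duplex is {duplex}"))
        elif speed != expected_speed:
            problems.append((port, f"speed is {speed}"))
    return problems


for port, reason in find_problem_ports(SAMPLE):
    print(f"{port}: {reason}")
```

Other platforms use different column orders and status keywords (for example "up" on Dell OS10 and "UP" on Cumulus), so the field positions would need adjusting before reuse.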


Command examples:

Cisco Nexus

switch# show interface status

--------------------------------------------------------------------------------
Port          Name               Status    Vlan      Duplex  Speed   Type
--------------------------------------------------------------------------------
mgmt0         --                 connected routed    full    1000    --

--------------------------------------------------------------------------------
Port          Name               Status    Vlan      Duplex  Speed   Type
--------------------------------------------------------------------------------
Eth1/1        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/2        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/3        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/4        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/5        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/6        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/7        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/8        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/9        Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/10       Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/11       Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/12       Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/13       Server:WAN         connected 1         full    10G     10Gbase-SR
Eth1/14       Server:WAN         connected 1         full    10G     10Gbase-SR
..
switch# show interface ethernet 1/1
Ethernet1/1 is up
admin state is up, Dedicated Interface
  Hardware: 1000/10000 Ethernet, address: 0cb4.0000.0101 (bia 0cb4.0000.0101)
  Description: Server:WAN
  MTU 1500 bytes, BW 10000000 Kbit , DLY 10 usec
  reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, medium is broadcast
  Port mode is access
  full-duplex, 10 Gb/s, media type is 10G
  Beacon is turned off
  Input flow-control is off, output flow-control is off
  Rate mode is dedicated
  Switchport monitor is off
  EtherType is 0x8100
  Last link flapped 00:01:14
  Last clearing of "show interface" counters never
  4 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 0 bits/sec, 0 packets/sec
    30 seconds output rate 296 bits/sec, 0 packets/sec
    input rate 0 bps, 0 pps; output rate 296 bps, 0 pps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 0 bits/sec, 0 packets/sec
    300 seconds output rate 200 bits/sec, 0 packets/sec
    input rate 0 bps, 0 pps; output rate 200 bps, 0 pps
  RX
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 input packets  0 bytes
    0 jumbo packets  0 storm suppression packets
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun   0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    125 unicast packets  127 multicast packets  110 broadcast packets
    362 output packets  72269 bytes
    0 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  0 output discard
    0 Tx pause

Dell OS10

OS10# show interface status

--------------------------------------------------------------------------------------------------
Port            Description     Status   Speed    Duplex   Mode Vlan Tagged-Vlans
--------------------------------------------------------------------------------------------------
Eth 1/1/1       Server:WAN      up       10G      full     A    1    -
Eth 1/1/2       Server:WAN      up       10G      full     A    1    -
Eth 1/1/3       Server:WAN      up       10G      full     A    1    -
Eth 1/1/4       Server:WAN      up       10G      full     A    1    -
Eth 1/1/5       Server:WAN      up       10G      full     A    1    -
Eth 1/1/6       Server:WAN      up       10G      full     A    1    -
Eth 1/1/7       Server:WAN      up       10G      full     A    1    -
Eth 1/1/8       Server:WAN      up       10G      full     A    1    -
Eth 1/1/9       Server:WAN      up       10G      full     A    1    -
Eth 1/1/10      Server:WAN      up       10G      full     A    1    -
Eth 1/1/11      Server:WAN      up       10G      full     A    1    -
Eth 1/1/12      Server:WAN      up       10G      full     A    1    -
Eth 1/1/13      Server:WAN      up       10G      full     A    1    -
Eth 1/1/14      Server:WAN      up       10G      full     A    1    -
Eth 1/1/15      Server:WAN      up       10G      full     A    1    -
...
OS10# show interface ethernet 1/1/1
Ethernet 1/1/1 is up, line protocol is up
Description: Server:WAN
Hardware is Eth, address is 0c:a6:36:d9:00:01
    Current address is 0c:a6:36:d9:00:01
Pluggable media present, RJ45 type is 10GBASE-T-RJ45
    Wavelength is 0
Interface index is 16
Internet address is not set
Mode of IPv4 Address Assignment: not set
Interface IPv6 oper status: Disabled
MTU 9216 bytes, IP MTU 9184 bytes
LineSpeed 10G, Auto-Negotiation on
Flowcontrol rx on tx off
ARP type: ARPA, ARP Timeout: 60
Tag Protocol IDentifier (TPID) value: 0x8100
Last clearing of "show interface" counters: 00:06:49
Queuing strategy: fifo
Input statistics:
     0 packets, 0 octets
     0 64-byte pkts, 0 over 64-byte pkts, 0 over 127-byte pkts
     0 over 255-byte pkts, 0 over 511-byte pkts, 0 over 1023-byte pkts
     0 Multicasts, 0 Broadcasts, 0 Unicasts
     0 runts, 0 giants, 0 throttles
     0 CRC, 0 overrun, 0 discarded
Output statistics:
     0 packets, 0 octets
     0 64-byte pkts, 0 over 64-byte pkts, 0 over 127-byte pkts
     0 over 255-byte pkts, 0 over 511-byte pkts, 0 over 1023-byte pkts
     0 Multicasts, 0 Broadcasts, 0 Unicasts
     0 throttles, 0 discarded, 0 Collisions,  wred drops
Rate Info(interval 30 seconds):
     Input 0 Mbits/sec, 0 packets/sec, 0% of line rate
     Output 0 Mbits/sec, 0 packets/sec, 0% of line rate
Time since last interface status change: 00:01:37

Cumulus

cumulus@cumulus:mgmt:~$ net show interface
State  Name     Spd  MTU    Mode       LLDP                           Summary
-----  -------  ---  -----  ---------  -----------------------------  --------------------------
UP     lo       N/A  65536  Loopback                                  IP: 127.0.0.1/8
       lo                                                             IP: ::1/128
UP     eth0     1G   1500   Mgmt                                      Master: mgmt(UP)
       eth0                                                           IP: 192.168.1.10/24(DHCP)
UP     swp1     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)       
UP     swp2     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp3     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp4     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp5     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp6     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp7     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp8     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp9     10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp10    10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
UP     swp11    10G  9216   Access/L2  host-xxxxxxxxxxxx (ens2f1np1)  Master: bridge(UP)
...
UP     bridge   N/A  9216   Bridge/L2
UP     mgmt     N/A  65536  VRF                                       IP: 127.0.0.1/8
cumulus@cumulus:mgmt:~$ net show interface swp2
    Name  MAC                Speed  MTU   Mode
--  ----  -----------------  -----  ----  ---------
UP  swp2  0c:e1:54:56:00:02  10G    9216  Access/L2

All VLANs on L2 Port
--------------------
1

Untagged
--------
1

cl-netstat counters
-------------------
RX_OK  RX_ERR  RX_DRP  RX_OVR  TX_OK  TX_ERR  TX_DRP  TX_OVR
-----  ------  ------  ------  -----  ------  ------  ------
    1       0       0       0     36       0       0       0

LLDP Details
------------
LocalPort  RemotePort(RemoteHost)
---------  ----------------------------
swp2       ens2f1np1(host-xxxxxxxxxxxx)

Routing
-------
  Interface swp2 is up, line protocol is up
  Link ups:       1    last: 2023/07/14 07:04:29.71
  Link downs:     0    last: (never)
  PTM status: disabled
  vrf: default
  index 4 metric 0 mtu 9216 speed 1000
  flags: <UP,BROADCAST,RUNNING,MULTICAST>
  Type: Ethernet
  HWaddr: 0c:e1:54:56:00:02
  Interface Type Other
  Master interface: bridge
  protodown: off


How to check if the MAC address of the server is set on the switch port

To check if the MAC address of the server is set on the switch port, you can use the following steps:

  1. Access the switch, log in, and identify the port: Follow steps 1 to 4 of the port status procedure above.
  2. View the MAC address table: Use the commands below to display the MAC address table on the switch.
  3. Check for the server's MAC address: Look for the MAC address of the server in the output of the previous command. The MAC address should be associated with the switch port where the server is connected.
  4. Verify MAC address learning: If you do not see the server's MAC address in the MAC address table, it means that the switch has not learned the MAC address from the server yet. In such cases, you can try the following:
    • Ensure the server is powered on and connected to the correct switch port.
    • Check the physical network connection, including the Ethernet/Fiber cable.
    • Verify if the server's network interface is functioning properly.
    • If the server is configured with a static MAC address, ensure it matches the expected MAC address.
  5. Further troubleshooting: If you encounter any issues, you can perform additional troubleshooting steps. This may involve checking the server's network configuration, examining the switch port configuration, verifying VLAN assignments, or investigating any network connectivity problems.
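
When comparing the server's MAC address with the switch output in step 3, note that platforms print the same address in different notations: Cisco Nexus uses dotted groups (`3612.407f.41d6`), while Dell OS10 and Cumulus use colon-separated octets (`36:12:40:7f:41:d6`). A small Python sketch for normalizing both forms before comparing them is shown here. The helper names `normalize_mac` and `same_mac` are hypothetical.

```python
import re


def normalize_mac(mac):
    """Return a MAC address as 12 lowercase hex digits joined by colons."""
    # Drop separators such as '.', ':' or '-' and keep only the hex digits.
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def same_mac(a, b):
    """Compare two MAC addresses regardless of notation or letter case."""
    return normalize_mac(a) == normalize_mac(b)


print(normalize_mac("3612.407f.41d6"))
print(same_mac("3612.407f.41d6", "36:12:40:7F:41:D6"))
```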


Command examples:

Cisco Nexus

switch# show mac address-table interface ethernet 1/1
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link,
        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan
   VLAN     MAC Address      Type      age     Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
*    1     3612.407f.41d6   dynamic  0         F      F    Eth1/1

Dell OS10

OS10# show mac address-table interface ethernet 1/1/1
Codes: pv <vlan-id> - private vlan where the mac is originally learnt
VlanId        Mac Address         Type        Interface
1             36:12:40:7f:41:d6   dynamic     ethernet1/1/1

Cumulus

cumulus@cumulus:mgmt:~$ net show bridge macs dynamic

VLAN  Master  Interface  MAC                TunnelDest  State  Flags  LastSeen
----  ------  ---------  -----------------  ----------  -----  -----  --------
   1  bridge  swp2       36:12:40:7f:41:d6                            00:01:24


How to verify the IPv6 Neighbors from the gateway

To verify IPv6 neighbors from the gateway on Cisco, Dell OS10, and Cumulus network devices, follow these steps:

  1. Access the switch/router and log in
  2. View the IPv6 neighbors: Run the command for your platform from the examples below. The output lists each neighbor's IPv6 address, MAC address, and associated interface.
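
To match a neighbor entry to a known server, it can help to compute the link-local address that a given MAC address would produce. The Python sketch below assumes the host derives its interface identifier with modified EUI-64 (the universal/local bit of the first octet flipped and `ff:fe` inserted in the middle). Hosts that use privacy or stable-privacy addresses will not match, so a mismatch is not proof of a problem. The function name `eui64_link_local` is hypothetical.

```python
import ipaddress


def eui64_link_local(mac):
    """Derive the modified EUI-64 link-local IPv6 address for a MAC address."""
    # Accept dotted, colon-separated, or plain notation.
    digits = "".join(c for c in mac if c in "0123456789abcdefABCDEF")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    octets = bytearray.fromhex(digits)
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix (8 bytes) followed by the 8-byte interface identifier.
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


print(eui64_link_local("0c94.ad2c.0000"))
print(eui64_link_local("2e:31:77:28:19:96"))
```

With the MAC addresses used in the sample outputs of this guide, the results correspond to the `fe80::` entries listed in the neighbor tables.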

Command examples:

Cisco Nexus

switch# show ipv6 neighbor

Flags: # - Adjacencies Throttled for Glean
       G - Adjacencies of vPC peer with G/W bit
       R - Adjacencies learnt remotely
       CP - Added via L2RIB, Control plane Adjacencies
       PS - Added via L2RIB, Peer Sync
       RO - Re-Originated Peer Sync Entry
       CC - Consistency check pending

IPv6 Adjacency Table for VRF default
Total number of entries: 6
Address         Age       MAC Address     Pref Source     Interface         Flags
2a00:fb01:400:100::1
                00:03:25  0c94.ad2c.0000  50   icmpv6     Ethernet1/1
fe80::e94:adff:fe2c:0
                00:03:20  0c94.ad2c.0000  50   icmpv6     Ethernet1/1
2a00:fb01:400:200:2c31:77ff:fe28:1996
                00:00:39  2e31.7728.1996  50   icmpv6     Vlan10
2a00:fb01:400:200:949a:afff:fe31:b3d7
                00:00:24  969a.af31.b3d7  50   icmpv6     Vlan10
fe80::2c31:77ff:fe28:1996
                00:01:11  2e31.7728.1996  50   icmpv6     Vlan10
fe80::949a:afff:fe31:b3d7
                00:00:59  969a.af31.b3d7  50   icmpv6     Vlan10

Dell OS10

OS10# show ipv6 neighbors
Codes: pv <vlan-id> - private vlan where the mac is originally learnt
IPv6 Address                            Hardware Address    State       Interface                Egress Int
------------------------------------------------------------------------------------------------------------------
2a00:fb01:400:100::1                    0c:94:ad:2c:00:00   reachable   ethernet1/1/1
2a00:fb01:400:200:2c31:77ff:fe28:1996   2e:31:77:28:19:96   reachable   vlan10                   ethernet1/1/2
2a00:fb01:400:200:949a:afff:fe31:b3d7   96:9a:af:31:b3:d7   reachable   vlan10                   ethernet1/1/3
fe80::e94:adff:fe2c:0                   0c:94:ad:2c:00:00   reachable   ethernet1/1/1
fe80::2c31:77ff:fe28:1996               2e:31:77:28:19:96   reachable   vlan10                   ethernet1/1/2
fe80::949a:afff:fe31:b3d7               96:9a:af:31:b3:d7   reachable   vlan10                   ethernet1/1/3

Cumulus

cumulus@cumulus:mgmt:~$ net show neighbor ipv6
Neighbor                               MAC                Interface  AF    STATE
-------------------------------------  -----------------  ---------  ----  ---------
fe80::e94:adff:fe2c:0                  0c:94:ad:2c:00:00  vlan1      IPv6  STALE
2a00:fb01:400:100::1                   0c:94:ad:2c:00:00  swp4       IPv6  STALE
2a00:fb01:400:200:949a:afff:fe31:b3d7  96:9a:af:31:b3:d7  vlan1      IPv6  REACHABLE
2a00:fb01:400:200:2c31:77ff:fe28:1996  2e:31:77:28:19:96  vlan1      IPv6  REACHABLE
fe80::e94:adff:fe2c:0                  0c:94:ad:2c:00:00  swp4       IPv6  STALE
fe80::949a:afff:fe31:b3d7              96:9a:af:31:b3:d7  vlan1      IPv6  REACHABLE
fe80::2c31:77ff:fe28:1996              2e:31:77:28:19:96  vlan1      IPv6  REACHABLE
cumulus@cumulus:mgmt:~$


How to verify the connectivity using a server/laptop

By performing these steps, you can verify the connectivity using a server or laptop, ensuring the ability to ping the gateway, reach external IPv6 addresses, and resolve hostnames via DNS. This will confirm that your setup is ready for deployment:

  1. Connect the server or laptop to the network: Ensure that the server or laptop is connected to the network where the gateway is located. This can be done by connecting an Ethernet cable to the same switch where the IC nodes will be connected.
  2. Obtain IPv6 address information: Configure the server or laptop with an IPv6 address, either manually or through automatic assignment (SLAAC). Ensure that the IPv6 address is within the same subnet as the gateway.
  3. Ping the gateway: Use the following command to ping the IPv6 address of the gateway: ping6 <gateway IPv6 address>
    • Replace <gateway IPv6 address> with the actual IPv6 address of the gateway. This command will send ICMPv6 echo requests to the gateway and wait for a response.
    • If you receive successful replies, it indicates that the server or laptop can reach the gateway over IPv6.
    • If you encounter "Destination unreachable" or "Request timed out" messages, it suggests that there may be connectivity issues between the server or laptop and the gateway. Check the network configuration, ensure the gateway is reachable, and verify firewall settings.
  4. Ping an external IPv6 address: Test connectivity to an external IPv6 address to verify connectivity beyond the gateway. Use the following command: ping6 <external IPv6 address>
    • Replace <external IPv6 address> with the IPv6 address of a known external host, such as a public IPv6 address or another device on the internet. This will help determine if there is end-to-end IPv6 connectivity from the server or laptop.
    • As an example you can use Google and Cloudflare DNS IPv6:
      • ping6 2001:4860:4860:0:0:0:0:8888
      • ping6 2606:4700:4700::1111
    • If you receive successful replies, it confirms that the server or laptop can communicate with external IPv6 addresses.
    • If you encounter issues or failures, check for any firewall rules, routing problems, or potential network configuration issues that may be affecting connectivity.
  5. Resolve the NNS nodes:
    • To test DNS resolution, attempt to resolve the hostnames icp0.io, icp-api.io, and ic0.app to their IPv6 addresses. Use the following commands:
      • nslookup -query=AAAA icp0.io
      • nslookup -query=AAAA icp-api.io
      • nslookup -query=AAAA ic0.app
    • These commands query the DNS server for the AAAA record (IPv6 address) of the above domains.
      • If the resolution succeeds, each command displays the corresponding IPv6 address, which confirms that DNS resolution is functioning correctly.
      • If the resolution fails, verify the DNS configuration, check for firewall rules blocking DNS traffic, check routing, and check the hosts file and DNS cache on the server or laptop.
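
Two of the checks above can also be scripted. The following Python sketch, using only the standard library, tests whether the address from step 2 shares a prefix with the gateway and looks up AAAA records as in step 5. The function names, the default /64 prefix length, and the host address `2a00:fb01:400:200::10` are assumptions for illustration. The gateway address is taken from the sample outputs in this guide.

```python
import ipaddress
import socket


def in_same_subnet(host_addr, gateway_addr, prefix_len=64):
    """True if host and gateway share the same IPv6 prefix of the given length."""
    network = ipaddress.ip_network(f"{gateway_addr}/{prefix_len}", strict=False)
    return ipaddress.ip_address(host_addr) in network


def resolve_aaaa(hostname):
    """Return the IPv6 addresses the system resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET6)
    except socket.gaierror:
        # Resolution failed; corresponds to the failure case in step 5.
        return []
    return sorted({info[4][0] for info in infos})


# Offline check only; resolve_aaaa needs a working network and DNS server.
print(in_same_subnet("2a00:fb01:400:200::10", "2a00:fb01:400:200::1"))
print(in_same_subnet("2a00:fb01:400:100::3", "2a00:fb01:400:200::1"))
```

On a host with working DNS, calling `resolve_aaaa("icp0.io")` should return at least one address. An empty list points to the troubleshooting items listed in step 5.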


How to verify the IPv6 routing from the gateway

To verify routing on the switch/gateway for IPv6, you can follow these steps:

  1. Access the switch/gateway: Connect to the switch/gateway using a console cable or remote management interface (SSH, Telnet, etc.), and log in with the appropriate credentials.
  2. View the routing table: Use the commands below to view the IPv6 routing table. The exact command varies with the device and its operating system.

Command examples:

Cisco Nexus

switch# show ipv6 route vrf all
IPv6 Routing Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]

0::/0, ubest/mbest: 1/0
    *via 2a00:fb01:400:100::1/128, [1/0], 00:01:01, static
2a00:fb01:400:100::/126, ubest/mbest: 1/0, attached
    *via 2a00:fb01:400:100::3, Eth1/1, [0/0], 00:01:02, direct,
2a00:fb01:400:100::3/128, ubest/mbest: 1/0, attached
    *via 2a00:fb01:400:100::3, Eth1/1, [0/0], 00:01:02, local
2a00:fb01:400:200::/64, ubest/mbest: 1/0, attached
    *via 2a00:fb01:400:200::1, Vlan10, [0/0], 00:05:39, direct,
2a00:fb01:400:200::1/128, ubest/mbest: 1/0, attached
    *via 2a00:fb01:400:200::1, Vlan10, [0/0], 00:05:39, local

Dell OS10

OS10# show ipv6 route
Codes: C - connected
       S - static
       B - BGP, IN - internal BGP, EX - external BGP, EV - EVPN BGP
       O - OSPF, IA - OSPF inter area, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type 2, E1 - OSPF external type 1,
       E2 - OSPF external type 2, * - candidate default,
       + - summary route, > - non-active route
Gateway of last resort is via 2a00:fb01:400:100::1 to network ::/0
       Destination                                 Gateway                                                   Dist/Metric   Last Change
--------------------------------------------------------------------------------------------------------------------------------------
  *S    ::/0                                  via 2a00:fb01:400:100::1                ethernet1/1/1           1/0           00:02:44
  C     2a00:fb01:400:100::/126               via 2a00:fb01:400:100::3                ethernet1/1/1           0/0           00:00:23
  C     2a00:fb01:400:200::/64                via 2a00:fb01:400:200::1                vlan10                  0/0           00:02:56

Cumulus

cumulus@cumulus:mgmt:~$ net show route ipv6
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

S>* ::/0 [1/0] via 2a00:fb01:400:100::1, swp4, weight 1, 00:01:11
C>* 2a00:fb01:400:100::/126 is directly connected, swp4, 00:01:11
C>* 2a00:fb01:400:200::/64 is directly connected, vlan1, 00:01:08
C * fe80::/64 is directly connected, vlan1, 00:01:08
C * fe80::/64 is directly connected, bridge, 00:01:09
C>* fe80::/64 is directly connected, swp4, 00:01:11
cumulus@cumulus:mgmt:~$

Consult the device documentation or vendor resources for the exact command to view the IPv6 routing table on your specific device.

  3. Examine the routing table: Analyze the output of the routing table command to verify the presence of IPv6 routes. Look for routes that have an IPv6 destination address and the associated next-hop or outgoing interface.
    • If there are specific destination IPv6 networks listed in the routing table, it indicates that the switch/gateway has routing information for those networks.
    • If the routing table is empty or does not include the expected IPv6 routes, it suggests that there might be an issue with routing configuration or connectivity.
  4. Verify default route: Check if there is a default route present in the routing table. The default route, represented as "::/0", is used for forwarding IPv6 traffic when no specific matching routes are found. Ensure that there is a valid next-hop or outgoing interface associated with the default route.
    • If a valid default route is present, it ensures that the switch/gateway has a way to forward IPv6 traffic to destinations outside of its directly connected networks.
    • If there is no default route or an incorrect default route, it can cause connectivity issues for IPv6 traffic that doesn't match any specific routes.
  5. Troubleshoot routing issues: If there are any routing issues, you can perform troubleshooting steps such as:
    • Check the configuration of IPv6 routing protocols (e.g., OSPFv3, BGP) if used.
    • Verify that the switch/gateway has the necessary interfaces configured with IPv6 addresses.
    • Ensure that the routing table entries are correct and reflect the expected network topology.
    • Verify that neighboring routers have the appropriate IPv6 routing information.
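
The default-route check can be sketched in Python as follows. The sketch assumes the Cumulus `net show route ipv6` line format shown in this guide and only tests one condition: that a static `::/0` route exists and that its next hop falls inside a directly connected prefix. The function name `check_default_route` and the regular expressions are illustrative assumptions and would need adapting for Cisco Nexus or Dell OS10 output.

```python
import ipaddress
import re

# Route lines copied from the Cumulus sample output in this guide.
SAMPLE = """\
S>* ::/0 [1/0] via 2a00:fb01:400:100::1, swp4, weight 1, 00:01:11
C>* 2a00:fb01:400:100::/126 is directly connected, swp4, 00:01:11
C>* 2a00:fb01:400:200::/64 is directly connected, vlan1, 00:01:08
"""

DEFAULT_RE = re.compile(r"^S[>* ]*\s*::/0\s.*?via\s+([0-9a-fA-F:]+),")
CONNECTED_RE = re.compile(r"^C[>* ]*\s*(\S+/\d+) is directly connected")


def check_default_route(output):
    """Return (next_hop, ok); ok is True if the next hop is in a connected prefix."""
    next_hop = None
    connected = []
    for line in output.splitlines():
        match = DEFAULT_RE.match(line)
        if match:
            next_hop = ipaddress.ip_address(match.group(1))
            continue
        match = CONNECTED_RE.match(line)
        if match:
            connected.append(ipaddress.ip_network(match.group(1)))
    if next_hop is None:
        return None, False  # no default route found
    return str(next_hop), any(next_hop in net for net in connected)


print(check_default_route(SAMPLE))
```

A result of `(None, False)` corresponds to the missing default route case in the steps above. A next hop outside every connected prefix suggests an incorrect default route.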


Please note that the commands provided are general examples and may differ slightly depending on the specific device model and software version. Refer to the documentation or vendor resources for more precise command syntax and options for your particular network device.