Systemd-networkd-wait-online failure

On our Raspberry Pi 4 (RPI4) running Ubuntu Server 22.04, we are seeing network issues that I believe are related to Sixfab CORE.

I started with our base system, where networking over a WiFi connection was working fine. Once I installed Sixfab CORE, the EC25 device on the RPI hat connected with the Sixfab SIM with no problems, and the Sixfab CONNECT portal showed all the correct system information.

But I noticed that certain network activities, like software updates and Docker builds, would fail consistently with various network timeouts. I went through the process of starting with a new system and installing CORE twice, and the results were identical in both cases. Details are below, but it appears that somehow the CORE software is causing the systemd-networkd-wait-online task to fail. Even if I disconnect the USB cable to the RPI hat, the problem remains.

I searched the community forums for similar failures, but did not see anything specific:

https://community.sixfab.com/t/install-finished-dashboard-shows-no-gconnectivity/1729
https://community.sixfab.com/t/cellular-connection-unavailable/1283
https://community.sixfab.com/t/cellular-connection-breaks-down-when-i-set-up-a-wlan-bridge/2405/2

Specifics:

  • RPI 4
  • Ubuntu 22.04 with latest updates
  • We run 4 docker containers with appropriate network interfaces, although the problem still occurs when the containers are not running.
  • RPI hat with Quectel EC25 module
  • Antennae provided with kit.
  • Sixfab SIM
  • Default Sixfab CORE connection, no changes made.

I’m using Netplan to set a specific WiFi IP address, but this also does not work when CORE is installed: I get a different IP address from the static address I specify. The Netplan config file is (with network info redacted):

# This file is generated from information provided by the datasource.  Changes
# to it will not persist across an instance reboot.  To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        eth0:
            dhcp4: true
            optional: true
    wifis:
      wlan0:
        dhcp4: no
        dhcp6: no
        optional: false
        addresses: [192.168.xxx.123/24]
        nameservers:
          addresses: [192.168.xxx.1, 8.8.8.8]
        gateway4: 192.168.xxx.1
        access-points:
          "network" :
             password : "some_password"
        #routes:
          #- to: 0.0.0.0/0
            #via: 192.168.103.1
            #metric: 300

/etc/os-release:

# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
# uname -a
Linux TWP12 5.15.0-1024-raspi #26-Ubuntu SMP PREEMPT Wed Jan 18 15:29:53 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
# lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 413c:2105 Dell Computer Corp. Model L100 Keyboard
Bus 001 Device 004: ID 2c7c:0125 Quectel Wireless Solutions Co., Ltd. EC25 LTE modem
Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 4, If 0, Class=Vendor Specific Class, Driver=option, 480M
        |__ Port 3: Dev 4, If 1, Class=Vendor Specific Class, Driver=option, 480M
        |__ Port 3: Dev 4, If 2, Class=Vendor Specific Class, Driver=option, 480M
        |__ Port 3: Dev 4, If 3, Class=Vendor Specific Class, Driver=option, 480M
        |__ Port 3: Dev 4, If 4, Class=Communications, Driver=cdc_ether, 480M
        |__ Port 3: Dev 4, If 5, Class=CDC Data, Driver=cdc_ether, 480M
        |__ Port 4: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
# usb-devices

T:  Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=480 MxCh= 1
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1d6b ProdID=0002 Rev=05.15
S:  Manufacturer=Linux 5.15.0-1024-raspi xhci-hcd
S:  Product=xHCI Host Controller
S:  SerialNumber=0000:01:00.0
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   4 Ivl=256ms

T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 4
D:  Ver= 2.10 Cls=09(hub  ) Sub=00 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2109 ProdID=3431 Rev=04.20
S:  Product=USB2.0 Hub
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   1 Ivl=256ms

T:  Bus=01 Lev=02 Prnt=02 Port=02 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0125 Rev=03.18
S:  Manufacturer=Quectel
S:  Product=EG25-G
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
E:  Ad=89(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
I:  If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

T:  Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=5000 MxCh= 4
D:  Ver= 3.00 Cls=09(hub  ) Sub=00 Prot=03 MxPS= 9 #Cfgs=  1
P:  Vendor=1d6b ProdID=0003 Rev=05.15
S:  Manufacturer=Linux 5.15.0-1024-raspi xhci-hcd
S:  Product=xHCI Host Controller
S:  SerialNumber=0000:01:00.0
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   4 Ivl=256ms

# dmesg | grep tty
[233769.105383] usb 1-1.3: GSM modem (1-port) converter now attached to ttyUSB0
[233769.108675] usb 1-1.3: GSM modem (1-port) converter now attached to ttyUSB1
[233769.109771] usb 1-1.3: GSM modem (1-port) converter now attached to ttyUSB2
[233769.111174] usb 1-1.3: GSM modem (1-port) converter now attached to ttyUSB3
# ls -l /sys/bus/usb-serial/devices
total 0
lrwxrwxrwx 1 root root 0 Feb 13 13:32 ttyUSB0 -> ../../../devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb1/1-1/1-1.3/1-1.3:1.0/ttyUSB0/
lrwxrwxrwx 1 root root 0 Feb 13 13:32 ttyUSB1 -> ../../../devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb1/1-1/1-1.3/1-1.3:1.1/ttyUSB1/
lrwxrwxrwx 1 root root 0 Feb 13 13:32 ttyUSB2 -> ../../../devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb1/1-1/1-1.3/1-1.3:1.2/ttyUSB2/
lrwxrwxrwx 1 root root 0 Feb 13 13:32 ttyUSB3 -> ../../../devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb1/1-1/1-1.3/1-1.3:1.3/ttyUSB3/
# ls -l /dev/serial/by-id
total 0
lrwxrwxrwx 1 root root 13 Feb 13 13:30 usb-Quectel_EG25-G-if00-port0 -> ../../ttyUSB0
lrwxrwxrwx 1 root root 13 Feb 13 13:30 usb-Quectel_EG25-G-if01-port0 -> ../../ttyUSB1
lrwxrwxrwx 1 root root 13 Feb 13 13:30 usb-Quectel_EG25-G-if02-port0 -> ../../ttyUSB2
lrwxrwxrwx 1 root root 13 Feb 13 13:30 usb-Quectel_EG25-G-if03-port0 -> ../../ttyUSB3
# atcom AT
AT
OK


# atcom ATI
ATI
Quectel
EG25
Revision: EG25GGBR07A08M2G

OK
# atcom AT+QGMR
AT+QGMR
EG25GGBR07A08M2G_30.006.30.006

OK

# atcom AT+CPIN?
AT+CPIN?
+CPIN: READY

OK

# atcom AT+CPAS
AT+CPAS
+CPAS: 0

OK

# atcom AT+CFUN?
AT+CFUN?
+CFUN: 1

OK

# atcom AT+COPS?
AT+COPS?
+COPS: 0,0,"AT&T Twilio",7

OK


# atcom AT+QCFG=”usbnet”
AT+QCFG=”usbnet”
ERROR

# atcom AT+QCFG="band"
AT+QCFG=band
ERROR

# atcom AT+CREG?
AT+CREG?
+CREG: 0,5

OK

# atcom AT+CGDCONT?
AT+CGDCONT?
+CGDCONT: 1,"IPV4V6","super","0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0",0,0,0,0
+CGDCONT: 2,"IPV4V6","ims","0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0",0,0,0,0
+CGDCONT: 3,"IPV4V6","SOS","0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0",0,0,0,1

OK

# atcom AT+CSQ
AT+CSQ
+CSQ: 14,99

OK

# atcom AT+CGATT?
AT+CGATT?
+CGATT: 1

OK

# atcom AT+QCFG="nwscanseq"
AT+QCFG=nwscanseq
ERROR

# atcom AT+QCFG="nwscanmode"
AT+QCFG=nwscanmode
ERROR

# atcom AT+QCFG="iotopmode"
AT+QCFG=iotopmode
ERROR


# atcom AT+QCSQ
AT+QCSQ
+QCSQ: "LTE",86,-117,120,-17

OK

# atcom AT+QNWINFO
AT+QNWINFO
+QNWINFO: "FDD LTE","310410","LTE BAND 2",1075

OK
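
A note on the ERROR replies to the AT+QCFG queries above: the echoed commands (AT+QCFG=band, AT+QCFG=nwscanseq, and so on) show that the double quotes never reached the modem, because the shell removes unescaped quotes before atcom receives its argument. The usbnet query kept its quotes only because they are typographic quotes, which the shell treats as ordinary characters. The sketch below uses Python's shlex module, which follows POSIX shell splitting rules, to illustrate the effect. The helper name is hypothetical, and whether the modem answers once the quotes are preserved is an assumption that has not been verified here.

```python
import shlex

def argv_seen_by_program(command_line):
    """Split a command line the way a POSIX shell would before running it."""
    return shlex.split(command_line)

# As typed in the log above: the shell strips the double quotes.
print(argv_seen_by_program('atcom AT+QCFG="band"'))
# ['atcom', 'AT+QCFG=band']

# Wrapping the whole argument in single quotes preserves the inner quotes.
print(argv_seen_by_program("atcom 'AT+QCFG=\"band\"'"))
# ['atcom', 'AT+QCFG="band"']

# Typographic quotes are not shell quoting characters, so they pass through.
print(argv_seen_by_program("atcom AT+QCFG=”usbnet”"))
```

If this reading is right, rerunning the queries as atcom 'AT+QCFG="band"' would be the first thing to try.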

Possibly related failure output from journalctl:

Feb 12 19:38:33 TWP12 systemd-networkd[683]: veth493071f: Lost carrier
Feb 12 19:38:33 TWP12 systemd-networkd[683]: veth493071f: DHCPv6 lease lost
Feb 12 19:38:33 TWP12 systemd-networkd[683]: veth077fa16: Lost carrier
Feb 12 19:38:33 TWP12 systemd-networkd[683]: veth077fa16: DHCPv6 lease lost
Feb 12 19:38:34 TWP12 systemd-networkd[683]: veth077fa16: Gained carrier
Feb 12 19:38:34 TWP12 systemd-networkd[683]: veth077fa16: Gained IPv6LL
Feb 12 19:38:44 TWP12 systemd-networkd-wait-online[1465681]: Timeout occurred while waiting for network connectivity.
Feb 12 19:38:44 TWP12 apt-helper[1465675]: E: Sub-process /lib/systemd/systemd-networkd-wait-online returned an error code (1)
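
By default, systemd-networkd-wait-online waits for the links that systemd-networkd manages to be configured, so one way to narrow down the timeout above is to look for links whose SETUP state never settles. The following is a minimal sketch that scans the text printed by `networkctl list`. The sample output is hypothetical (it is not taken from this system), and the function name is made up for illustration.

```python
def links_not_settled(networkctl_output):
    """Return (link, operational, setup) for links whose SETUP state is not final."""
    final_states = {"configured", "unmanaged"}
    pending = []
    for line in networkctl_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric index: IDX LINK TYPE OPERATIONAL SETUP
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        _, link, _, operational, setup = fields[:5]
        if setup not in final_states:
            pending.append((link, operational, setup))
    return pending

# Hypothetical sample, shaped like `networkctl list` output.
SAMPLE = """\
IDX LINK            TYPE     OPERATIONAL SETUP
  1 lo              loopback carrier     unmanaged
  2 eth0            ether    no-carrier  configuring
  3 wlan0           wlan     routable    configured
  4 enxf6da71c5d6ba ether    routable    unmanaged

4 links listed.
"""

print(links_not_settled(SAMPLE))
# [('eth0', 'no-carrier', 'configuring')]
```

Any link reported here is a candidate for what the wait-online service is blocked on.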

Note that the failures occur whether our docker containers are running or not.

Hi,

Thank you for the detailed explanation.
So are you getting an IP on the cellular interface? CORE does not interfere with the systemd-networkd-wait-online service, and this service is already inactive by default. The issue you are experiencing appears to be independent of CORE. Disabling that service or adding the “--timeout” parameter may solve the problem. If possible, install CORE on a fresh Ubuntu Server image first and confirm that the cellular interface comes up, then add the Docker containers and the Netplan configuration step by step; that should make it easier to identify the cause of the problem.

Yes, I get an IP address on the cellular interface, and I’ve verified that if I disable WiFi, traffic flows over the cell connection.

I’ve started from scratch 3 times, and the same problem happens every time. Here are the steps I went through this morning.

Download the latest Ubuntu Server image for Raspberry Pi, LTS 22.04.1, and boot the system. At this point, systemd-networkd-wait-online is enabled and has completed successfully:

# systemctl status systemd-networkd-wait-online
● systemd-networkd-wait-online.service - Wait for Network to be Configured
     Loaded: loaded (/lib/systemd/system/systemd-networkd-wait-online.service; enabled; vendor preset: disabled)
     Active: active (exited) since Wed 2023-02-22 10:58:14 UTC; 1min 28s ago
       Docs: man:systemd-networkd-wait-online.service(8)
    Process: 624 ExecStart=/lib/systemd/systemd-networkd-wait-online (code=exited, status=0/SUCCESS)
   Main PID: 624 (code=exited, status=0/SUCCESS)
        CPU: 42ms

Feb 22 10:58:09 ubuntu systemd[1]: Starting Wait for Network to be Configured...
Feb 22 10:58:14 ubuntu systemd-networkd-wait-online[624]: managing: wlan0
Feb 22 10:58:14 ubuntu systemd[1]: Finished Wait for Network to be Configured.

But I don’t think the problem is related to the wait-online service. Here are the steps I followed this morning. After booting the new Ubuntu image, I set up our WiFi/IP address with Netplan, and all is good.

We use Netplan to set a static IP address of 192.168.103.212.

# ping 8.8.8.8          // This is to verify WiFi is up and running with no problems
# apt update
# apt upgrade -y

# reboot

# route 

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.103.1   0.0.0.0         UG    0      0        0 wlan0
0.0.0.0         192.168.103.1   0.0.0.0         UG    600    0        0 wlan0
192.168.103.0   0.0.0.0         255.255.255.0   U     0      0        0 wlan0
192.168.103.1   0.0.0.0         255.255.255.255 UH    600    0        0 wlan0

I create a Python virtual environment and install some packages to test the WiFi connection with some Python modules.

# pip install -r reqs.txt

And this works fine, all packages install reliably (all 4 times I’ve done this).

Now install Sixfab CORE via the Sixfab install script in the console window on the RPI4. Connect the USB cable to the Sixfab board, and reboot.

System is slow to boot waiting for the network service to start.

System now has the following IP addresses:
192.168.103.109 on wlan0/WiFi. Note that Netplan is still set up for .212.
192.168.255.46 on enxf6da71c5d6ba.

Connect dashboard shows device connected, with the following priorities:

Wifi first
Cell second

Route table is:

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.103.1   0.0.0.0         UG    200    0        0 wlan0
0.0.0.0         192.168.225.1   0.0.0.0         UG    1024   0        0 enxf6da71c5d6ba
192.168.103.0   0.0.0.0         255.255.255.0   U     200    0        0 wlan0
192.168.103.1   0.0.0.0         255.255.255.255 UH    200    0        0 wlan0
192.168.225.0   0.0.0.0         255.255.255.0   U     1024   0        0 enxf6da71c5d6ba
192.168.225.1   0.0.0.0         255.255.255.255 UH    1024   0        0 enxf6da71c5d6ba
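
To double-check the reading of the table above: when several default routes exist, the kernel prefers the one with the lowest metric. The following is a minimal sketch that extracts the default routes from `route -n` output and orders them by metric. The two default-route rows are copied from the table above; the function name is made up for illustration.

```python
ROUTE_N_OUTPUT = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.103.1   0.0.0.0         UG    200    0        0 wlan0
0.0.0.0         192.168.225.1   0.0.0.0         UG    1024   0        0 enxf6da71c5d6ba
192.168.103.0   0.0.0.0         255.255.255.0   U     200    0        0 wlan0
192.168.225.0   0.0.0.0         255.255.255.0   U     1024   0        0 enxf6da71c5d6ba
"""

def default_routes(route_output):
    """Return default routes as (metric, gateway, iface), lowest metric first."""
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        # A default route has destination 0.0.0.0 and genmask 0.0.0.0.
        if len(fields) != 8 or fields[0] != "0.0.0.0" or fields[2] != "0.0.0.0":
            continue
        gateway, metric, iface = fields[1], int(fields[4]), fields[7]
        routes.append((metric, gateway, iface))
    return sorted(routes)

for metric, gateway, iface in default_routes(ROUTE_N_OUTPUT):
    print(f"{iface}: via {gateway}, metric {metric}")
# wlan0: via 192.168.103.1, metric 200
# enxf6da71c5d6ba: via 192.168.225.1, metric 1024
```

With these metrics, outbound traffic should leave via wlan0, which matches the WiFi-first priority shown on the dashboard.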

So generically this looks OK, but why has the IP address changed from the specified .212 to .109?

If I run the following, it completes with no errors:

# netplan --debug apply

The IP address remains at .109, not the required .212
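
A small check like the following can make the mismatch easy to re-test after each change. It is a minimal sketch that reads the JSON printed by `ip -j -4 addr show dev wlan0` and compares it with the address Netplan is supposed to assign. The sample JSON is trimmed to the keys the code uses and is illustrative rather than captured from this system; the function name is hypothetical.

```python
import json

def ipv4_addresses(ip_json_text, ifname):
    """Return the IPv4 addresses (as 'a.b.c.d/len') reported for ifname."""
    found = []
    for link in json.loads(ip_json_text):
        if link.get("ifname") != ifname:
            continue
        for info in link.get("addr_info", []):
            if info.get("family") == "inet":
                found.append(f"{info['local']}/{info['prefixlen']}")
    return found

# Illustrative sample, trimmed to the keys used above.
SAMPLE = (
    '[{"ifname": "wlan0", "addr_info": '
    '[{"family": "inet", "local": "192.168.103.109", "prefixlen": 24}]}]'
)

expected = "192.168.103.212/24"
actual = ipv4_addresses(SAMPLE, "wlan0")
print("match" if expected in actual else f"mismatch: wanted {expected}, got {actual}")
# mismatch: wanted 192.168.103.212/24, got ['192.168.103.109/24']
```

On the real system, the text would come from running the `ip` command above, for example through `subprocess.run(..., capture_output=True, text=True)`.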

From here I run the SixFab uninstall script, unplug the USB cable, and reboot.

Still get a delay on boot due to the wait online service.

IP address is still .109, not .212 as required.

This is all generic Ubuntu on RPI - latest OS with proper updates, etc.

Any ideas or suggestions are appreciated. I believe this configuration should work given this is generic Ubuntu. We have the capability to support multiple network interfaces, and have no problems with those on other systems (this system only has the SixFab board).

And I’ve not been able to recover from this once the Sixfab software has been installed. The only way to recover is to start over with a new Ubuntu image.

Any updates or ideas on this? It doesn’t seem like this should fail consistently on stock Ubuntu. If we cannot reliably assign an IP address and reliably route traffic over a given interface, this will not work for us (and probably many others, I suspect).

Unfortunately, I do not have any suggestions regarding assigning an IP address with Netplan, but you may try uninstalling CORE and following the ECM tutorial. In fact, there doesn’t seem to be any issue with your cellular connection. Because the setup in that tutorial does not depend on any services, it would resolve the problem if a service were the cause (which doesn’t seem to be the case).

Your statement that “there doesn’t seem to be any issue with your cellular connection” is not correct. Please refer back to my original post, where I document the actual problem: a failed software install where WiFi is the primary device and cell is secondary. I believe this is due to something wrong with the routing tables.

You responded to my original request by stating that the network wait systemd service was not enabled by default, which I knew was wrong. So I showed that it is enabled by default, using the standard Ubuntu Server distro out of the box. This is unrelated to the actual problem, but it is perhaps a symptom, since we don’t have trouble with other modems.

So the question remains. With WiFi as primary with an Internet connection, and the Sixfab cell as secondary, why would a pip install fail, such as:

# pip install -r reqs.txt

Note that some pip modules install OK, while others fail. As a reminder, the pip install works fine when the Sixfab software is not installed. It smells like a route table problem to me, but I don’t really see any issues with the route tables.
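
Since only some packages fail, it may help to separate name-resolution failures from connection failures before looking further at routing. The following is a minimal sketch of such a probe. It assumes reqs.txt pulls from the default index, in which case pip contacts pypi.org and files.pythonhosted.org; the function names, host list, and timeout are illustrative choices, not anything taken from the Sixfab tooling.

```python
import socket

# Hosts pip contacts when the default package index is used (assumption).
HOSTS = ("pypi.org", "files.pythonhosted.org")

def probe(host, port=443, timeout=5.0):
    """Return 'dns-failed', 'connect-failed', or 'ok' for a TCP connection attempt."""
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failed"
    for family, socktype, proto, _, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Try the next resolved address (for example IPv4 after IPv6).
            continue
    return "connect-failed"

def report(hosts=HOSTS):
    """Print one probe result per host."""
    for name in hosts:
        print(f"{name}: {probe(name)}")
```

Running `report()` once on the stock image and once after installing CORE would show whether the failures come from DNS or from the connection itself. It does not by itself prove which interface the traffic used.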

Any ideas are appreciated.