Let me start by saying that I have been running Raspberry Pi servers since the first Raspberry Pi was released more than a decade ago. The only problems I ever had until now were dying SD cards with the first generation of RPis. Since then I only buy really big, high-quality SD cards, and I have had RPi(4) servers running 24/7 for years without any trouble.

For a new project, I am running a web service on a Raspberry Pi Zero 2 W with an Apache reverse proxy on the same machine. Memory usage, even under load, is at most 100 MB. This Pi Zero 2 W simply dies after a few days, and I have no idea how to debug the problem.

More details on the Pi Zero 2 W:

  • Runs Raspbian configured via Ansible to be an exact replica of my RPi(4); only Apache and the web service were added
  • Quality power supply (original RPi hardware), plugged into literally the same electrical circuit as the RPi(4)
  • The web app is just a ‘hello, world’ that shows the current time, and my internet connection is not fast enough to be DoSed
  • Monitored memory usage etc. for several hours and found nothing out of the ordinary
  • fail2ban is active and running
  • The SD card has several GB of unused free space and is the same brand/quality as the one in the RPi(4)

Has anyone experienced something similar? Does anyone have an idea how to approach debugging this problem?
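For anyone debugging something similar: one quick check on Raspberry Pi OS is the firmware's undervoltage/throttling flags (just a sketch; vcgencmd ships with Raspberry Pi OS):

```shell
# Nonzero output means undervoltage or throttling, either right now or
# at some point since boot.
# Bit 0: undervoltage right now; bit 16: undervoltage has occurred.
vcgencmd get_throttled
```

An output of throttled=0x0 means the power rail has stayed healthy since boot.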

I am not sure there is a better place on Lemmy for this kind of question than here. I’ll happily move this post elsewhere if it is not appropriate here.

  • czardestructo@lemmy.world · 1 day ago

    I’ve owned and deployed a lot of Pis, every model, and in my experience, instability like you describe is related to the SD card: either the card itself or the tray soldered to the Pi. I had one Pi that would corrupt the SD card without fail after two months; I bent the SD card's metal tray inward a little so it pressed the card better against the contacts, and the problem went away. Try fiddling with the SD card holder or with different SD cards.

    • wolf@lemmy.zipOP · 3 days ago

      With dmesg and journalctl -k I found only entries from after the reboot, saying that the shutdown was not clean. Any specific logs where I could find more?
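      By default Raspberry Pi OS keeps the journal only in RAM, so messages from before a crash are lost on reboot. A sketch for making it persistent (standard systemd journald.conf):

      ```shell
      # Keep the journal across reboots so pre-crash kernel messages survive
      sudo mkdir -p /var/log/journal
      sudo sed -i 's/^#\?Storage=.*/Storage=persistent/' /etc/systemd/journald.conf
      sudo systemctl restart systemd-journald

      # After the next hang and forced reboot, inspect the previous boot:
      journalctl -b -1 -k -p warning
      ```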

        • wolf@lemmy.zipOP · 2 days ago

          Thanks. I could find neither a /var/log/kern.log file, nor did find /var/log | grep -i mess produce a match.

  • Machinist@lemmy.world · 3 days ago

    I have a Pi 4 server that hangs after a few days. I tried to track down what was causing it but didn’t go to extremes. The most expedient solution was to force a reboot.

    I have a cron job that reboots the Pi at 3:00 AM each day. An easy workaround; lazy was better.
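    As a sketch, such a cron job could live in a drop-in file (the file name below is made up):

    ```shell
    # /etc/cron.d/nightly-reboot (hypothetical file name)
    # m  h  dom mon dow  user  command
    0 3 * * * root /sbin/shutdown -r now
    ```

    Files in /etc/cron.d take a user field, unlike a personal crontab.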

  • Eggymatrix@sh.itjust.works · 3 days ago

    The issue is that the Pi is simply not built with reliability as a high priority, so after a while some hardware component probably gives out. There is a reason every reasonable company that needs a server to run reliably in production uses something orders of magnitude more expensive than an RPi.

    You lucked out with your previous experiences, but many others did not; otherwise the industry would not pay the price of an RPi per month to run a machine with the specs of an RPi.

    That said, if you don’t need that reliability, some easy hacks, like a reboot cron job or systemd timer, or turning off unneeded services and peripherals, could give you 90% of an industrial server’s reliability.
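    A systemd timer variant of the reboot workaround could look like this (unit names here are made up for illustration):

    ```shell
    # Create a oneshot service plus a timer that triggers it nightly.
    sudo tee /etc/systemd/system/nightly-reboot.service >/dev/null <<'EOF'
    [Unit]
    Description=Nightly reboot workaround

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/systemctl reboot
    EOF

    sudo tee /etc/systemd/system/nightly-reboot.timer >/dev/null <<'EOF'
    [Unit]
    Description=Reboot nightly at 03:00

    [Timer]
    OnCalendar=*-*-* 03:00:00

    [Install]
    WantedBy=timers.target
    EOF

    sudo systemctl enable --now nightly-reboot.timer
    ```

    Unlike cron, systemctl list-timers then shows exactly when the next reboot will fire.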