Node Troubleshooting

This page lists common troubleshooting scenarios and solutions for node operators.

401 Unauthorized: Signature Invalid

If you see a log that looks like this in pnc--batcher:

WARN [12-13|15:53:20.263] Derivation process temporary error       attempts=80 err="stage 0 failed resetting: temp: failed to find the L2 Heads to start from: failed to fetch current L2 forkchoice state: failed to find the finalized L2 block: failed to determine L2BlockRef of finalized, could not get payload: 401 Unauthorized: signature is invalid

It means that the pnc--batcher is unable to authenticate with pnc-geth's authenticated RPC using the JWT secret.

Solution

  1. Check that both services are configured with the same JWT secret.
  2. Check that pnc-geth's authenticated RPC is enabled, and that the URL is correct.
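A quick way to rule out a mismatch is to compare the two secret files directly. The sketch below assumes the usual Engine API convention of a 32-byte secret stored as hex, with an optional 0x prefix; the function names are illustrative and not part of either service.

```python
import hmac


def normalize_jwt_secret(raw: str) -> bytes:
    """Decode a hex JWT secret, tolerating a 0x prefix and surrounding whitespace."""
    text = raw.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    secret = bytes.fromhex(text)  # raises ValueError if the text is not valid hex
    if len(secret) != 32:
        # Assumption: the authenticated RPC expects a 256-bit secret.
        raise ValueError(f"expected 32 bytes, got {len(secret)}")
    return secret


def secrets_match(raw_a: str, raw_b: str) -> bool:
    """Return True if both files decode to the same secret bytes."""
    return hmac.compare_digest(normalize_jwt_secret(raw_a), normalize_jwt_secret(raw_b))
```

To use it, read the file each service is configured with and pass both contents to secrets_match. A ValueError usually points at a truncated or non-hex file rather than a mismatch.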

403 Forbidden: Invalid Host Specified

If you see a log that looks like this in pnc--batcher:

{"err":"403 Forbidden: invalid host specified\n","lvl":"error","msg":"error getting latest header","t":"2022-12-13T22:29:18.932833159Z"}

It means that the hostname pnc--batcher uses to reach pnc-geth is not in pnc-geth's list of allowed virtual hosts.

Solution

  1. Make sure that the --authrpc.vhosts parameter in pnc-geth either includes the hostname that pnc--batcher uses in its pnc-geth URL, or is set to *.
  2. Check that pnc-geth's authenticated RPC is enabled, and that the URL is correct.
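The check behind this error can be modeled roughly as follows. This is a simplified sketch that assumes pnc-geth behaves like upstream Geth, where the Host header of each request is matched against the comma-separated --authrpc.vhosts list and IP literals are accepted without a match. It is not a copy of the actual implementation, and the function names are hypothetical.

```python
import ipaddress


def strip_port(host_header: str) -> str:
    """Drop a trailing :port, handling bracketed IPv6 literals such as [::1]:8551."""
    if host_header.startswith("["):
        return host_header[1:host_header.find("]")]
    if host_header.count(":") == 1:
        return host_header.rsplit(":", 1)[0]
    return host_header


def host_allowed(host_header: str, vhosts: str) -> bool:
    """Model of the virtual-host check: vhosts is the comma-separated flag value."""
    allowed = {v.strip().lower() for v in vhosts.split(",") if v.strip()}
    host = strip_port(host_header).lower()
    try:
        ipaddress.ip_address(host)
        return True  # assumption: requests addressed to an IP literal skip the check
    except ValueError:
        pass
    return "*" in allowed or host in allowed
```

Under this model, a URL such as http://pnc-geth:8551 is rejected when the flag only lists localhost, while the same request succeeds once pnc-geth is added to the list or the flag is set to *.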

Failed to Load P2P Config

If you see a log that looks like this in pnc--batcher:

CRIT [12-13|13:46:21.386] Application failed                       message="failed to load p2p config: failed to load p2p discovery options: failed to open discovery db: mkdir /p2p: permission denied"

It means that the pnc--batcher lacks write access to the P2P discovery or peerstore directories.

Solution

  1. Make sure that the pnc--batcher has write access to the P2P directory. By default, this is /p2p.
  2. Alternatively, point the --p2p.discovery.path and --p2p.peerstore.path parameters at a directory the pnc--batcher can write to.
  3. Alternatively, set --p2p.discovery.path and --p2p.peerstore.path to memory to disable persistence.
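Before changing flags, it can help to confirm whether the service user can create the directory at all. The sketch below is a generic filesystem check, not something pnc--batcher provides. It must be run as the same user (and inside the same container, if any) as the service, and the function name is illustrative.

```python
import os


def can_create(path: str) -> bool:
    """Return True if the current user can write into `path` or create it."""
    probe = os.path.abspath(path)
    # Walk up to the nearest ancestor that already exists.
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:
            return False
        probe = parent
    # Creating entries needs write and search permission on that directory.
    return os.path.isdir(probe) and os.access(probe, os.W_OK | os.X_OK)
```

For the default layout, can_create("/p2p") returning False matches the "mkdir /p2p: permission denied" error in the log above.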

Wrong Chain

If you see a log that looks like this in pnc--batcher:

{"attempts":183,"err":"stage 0 failed resetting: temp: failed to find the L2 Heads to start from: wrong chain L1: genesis: 0x4104895a540d87127ff11eef0d51d8f63ce00a6fc211db751a45a4b3a61a9c83:8106656, got 0x12e2c18a3ac50f74d3dd3c0ed7cb751cc924c2985de3dfed44080e683954f1dd:8106656","lvl":"warn","msg":"Derivation process temporary error","t":"2022-12-13T23:31:37.855253213Z"}

It means that the pnc--batcher is pointing to the wrong chain.

Solution

  1. Verify that the pnc--batcher's L1 URL is pointing to the correct L1 for the given network.
  2. Verify that the pnc--batcher's rollup config or --network parameter is set to the correct network.
  3. Verify that the pnc--batcher's L2 URL is pointing to the correct instance of pnc-geth, and that pnc-geth is properly initialized for the given network.
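The first check can be done by hand by asking the L1 endpoint for the block named in the log and comparing hashes. The sketch below uses the standard eth_getBlockByNumber JSON-RPC method. The helper names are hypothetical, and it assumes the hash after "genesis:" in the log is the one the rollup config expects.

```python
import json
import urllib.request


def block_by_number_request(number: int) -> dict:
    """Build a JSON-RPC request for a block header (no full transactions)."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByNumber",
        "params": [hex(number), False],
    }


def hash_from_response(response: dict) -> str:
    """Extract the block hash, failing clearly if the endpoint has no such block."""
    block = response.get("result")
    if block is None:
        raise LookupError("block not found on this endpoint")
    return block["hash"].lower()


def fetch_block_hash(rpc_url: str, number: int) -> str:
    """Query the given L1 endpoint; rpc_url is whatever the L1 URL is set to."""
    body = json.dumps(block_by_number_request(number)).encode()
    req = urllib.request.Request(
        rpc_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return hash_from_response(json.load(resp))


def same_chain(expected_hash: str, actual_hash: str) -> bool:
    """Compare hashes case-insensitively."""
    return expected_hash.lower() == actual_hash.lower()
```

With the log above, calling fetch_block_hash on the configured L1 URL with block 8106656 and comparing the result to the "genesis" hash shows whether the endpoint or the rollup config is the side that is wrong.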

Unclean Shutdowns

If you see a log that looks like this in pnc-geth:

WARN [03-05|16:18:11.238] Unclean shutdown detected                booted=2023-03-05T11:09:26+0000 age=5h8m45s

It means pnc-geth has experienced an unclean shutdown. According to the Geth docs, if Geth stops unexpectedly, the database can be corrupted. This is known as an "unclean shutdown", and it can lead to a variety of problems for the node when it is restarted.

It is always best to shut down Geth gracefully, i.e. using a shutdown command such as ctrl-c, docker stop -t 300 <container ID> or systemctl stop. Note that systemctl stop has a default timeout of 90s: if Geth takes longer than this to shut down gracefully, it is killed forcefully. To allow more time, set TimeoutStopSec in the [Service] section of the unit file to a larger value, at least 300s.

This way, Geth knows to write all relevant information into the database to allow the node to restart properly later. This can involve >1GB of information being written to the LevelDB database which can take several minutes.
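If pnc-geth is launched from a custom supervisor script rather than Docker or systemd, the same pattern applies: send an interrupt, then wait long enough for the flush to finish before resorting to a kill. The following is a minimal sketch of that pattern for a POSIX host, not a recommended process manager. The function name is hypothetical, and the 300s default mirrors the docker stop -t 300 example above.

```python
import signal
import subprocess


def stop_gracefully(proc: subprocess.Popen, timeout_s: float = 300.0) -> bool:
    """Send SIGINT (the same signal as ctrl-c) and wait for the process to exit.

    Returns True if the process exited on its own within timeout_s,
    False if it had to be killed.
    """
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Last resort: this is the forced stop that risks an unclean shutdown.
        proc.kill()
        proc.wait()
        return False
```

A False return value is worth logging, since it means the next start is likely to report an unclean shutdown.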

Solution

If an unexpected shutdown does occur, the removedb subcommand can be used to delete the state database and resync it from the ancient database. This should get the database back up and running.