Elevated badNonce and malformed errors when trying to issue certificates #17

Closed
Opened by Ghost on 2019-02-21 05:00:16 +00:00 · 10 comments

I'm seeing lots of intermittent errors while issuing certificates since I upgraded my service to greenlock from letsencrypt. About 90% of the errors are badNonce, and the rest are malformed. Retrying a few times usually results in getting a certificate issued.

Looks like this is related to #7.

We are on Node v10.15.0, using:

  • Greenlock v2.6.7
  • acme-v2 v1.5.2
  • rsa-compat v1.9.2

All of them are the latest versions except rsa-compat, but looking at its changes, I don't see anything that would fix the problem by bumping to v2.0.2. In any case, rsa-compat@1.9.2 is a dependency of acme-v2@1.5.2, so I can't really bump it until there is a new acme-v2 release.

badNonce error trace -

Error: [acme-v2.js] authorizations were not fetched for 'doc.mail.freenet.de':
{"type":"urn:ietf:params:acme:error:badNonce","detail":"JWS has an invalid anti-replay nonce: \"{{nonce}}\"","status":400}
    at /var/www-api/node_modules/acme-v2/node.js:620:33
    at process._tickCallback (internal/process/next_tick.js:68:7)
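
For context on the trace above: RFC 8555 says a server that rejects a request with badNonce includes a fresh Replay-Nonce header in the error response, and the client should retry with that nonce. A minimal sketch of such a retry loop follows. The `send` callback, the response shape `{ status, headers, body }`, and the name `postWithNonceRetry` are assumptions for illustration and are not part of acme-v2.

```javascript
'use strict';

const BAD_NONCE = 'urn:ietf:params:acme:error:badNonce';

// Hypothetical helper, not an acme-v2 API.
// `send(nonce)` is assumed to sign the request with the given nonce, POST it,
// and resolve to { status, headers, body } with lower-cased header names.
async function postWithNonceRetry(send, firstNonce, maxAttempts = 3) {
  let nonce = firstNonce;
  let res;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    res = await send(nonce);
    const isBadNonce =
      res.status === 400 && res.body && res.body.type === BAD_NONCE;
    if (!isBadNonce) {
      return res; // success, or an error unrelated to the nonce
    }
    // The badNonce response itself carries a fresh nonce to retry with.
    nonce = res.headers['replay-nonce'];
    if (!nonce) {
      break; // nothing to retry with; return the error response
    }
  }
  return res;
}
```

Capping the number of attempts keeps a persistent failure from looping forever; the last response is returned so the caller can still report it.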

malformed error trace -

Didn't finalize order: Unhandled status '400'. This is not one of the known statuses...
Requested: '{{domain}}'
Validated: '{{domain}}'
{
  "type": "urn:ietf:params:acme:error:malformed",
  "detail": "Order's status (\"processing\") is not acceptable for finalization",
  "status": 400
}

Please open an issue at https://git.coolaj86.com/coolaj86/acme-v2.js
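
For context on the malformed error above: in RFC 8555 an order can only be finalized while its status is "ready". A status of "processing" means finalization has already been requested, so the client is expected to poll the order rather than finalize again. A minimal sketch of that decision, assuming a hypothetical `nextStepForOrder` helper that is not part of acme-v2:

```javascript
'use strict';

// Hypothetical helper, not an acme-v2 API.
// Maps an ACME order status (RFC 8555) to the action a client takes next.
function nextStepForOrder(status) {
  switch (status) {
    case 'pending':
      return 'complete-authorizations'; // challenges are not yet satisfied
    case 'ready':
      return 'finalize'; // the only state in which the CSR may be submitted
    case 'processing':
      return 'poll'; // finalization already requested; wait and re-fetch the order
    case 'valid':
      return 'download'; // the certificate URL is available
    case 'invalid':
      return 'fail'; // the order cannot be recovered
    default:
      throw new Error("Unknown order status '" + status + "'");
  }
}
```

With a mapping like this, an order that is already "processing" leads to polling instead of a second finalize request.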
Author

Hi,

Is there anything I can do to help debug this?

Author

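According to the Let's Encrypt folks, a badNonce error can occur when requests are made from different IP addresses or when the SSL connection is not reused (https://community.letsencrypt.org/t/jws-has-an-invalid-anti-replay-nonce-when-client-behind-nat/66493/3?u=elssar):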

> dehydrated is a shell script that runs a separate curl command for each request, which means each request is made on a new connection, which means you get assigned a new source IP address. Most likely a client that reuses an HTTPS connection over multiple requests would have much less trouble. That rules out shell-based clients, but other clients should work reasonably well.

Looking at the code for urequest, it looks like it uses the default https global agent, and does not set keepAlive to true. I tried to set the default to true, but that didn't seem to help.

@coolaj86 any ideas on what I might be missing?

Admin

@elssar Thanks so much for this!

That makes a lot of sense, but I never would have guessed.

I'll take a look at urequest and make sure that

  1. agent options can be passed in
  2. the correct agent options are used by acme-v2
Admin

> Node.js maintains several connections per server to make HTTP requests. This function allows one to transparently issue requests.

That may explain why even the default agent is having this issue.

I'm about to push a change to urequest that will allow agent to be passed in. I think that creating an agent with only one request per server may help. We shall see.
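
The idea of a dedicated agent can be sketched with Node's built-in `https.Agent`, using its documented `keepAlive` and `maxSockets` options. This is only an illustration under stated assumptions: the names `acmeAgent` and `withAcmeAgent` are hypothetical, and how urequest or acme-v2 forward the `agent` option is not shown here.

```javascript
'use strict';

const https = require('https');

// One persistent socket per host: requests to the same server are queued
// and sent over a single reused connection instead of several parallel ones.
const acmeAgent = new https.Agent({
  keepAlive: true, // keep the socket open between requests
  maxSockets: 1 // at most one concurrent connection per host
});

// Hypothetical helper: returns a copy of the request options with the
// dedicated agent attached, suitable for https.request(options, callback).
function withAcmeAgent(options) {
  return Object.assign({}, options, { agent: acmeAgent });
}
```

Limiting the agent to one socket trades throughput for a stable connection, which is the property this thread suspects matters for nonce validation.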

Admin

I've been doing a number of things tonight. I did modify urequest@1.3.7 to respect agent when passed in. I'll try to test the rest tomorrow, but feel free to try it and let me know.

Also, I don't get these errors myself. If you come up with a way to reproduce it that would be awesome.

Author

@coolaj86 I haven't been able to reproduce this outside of my production environment, but I think I have an idea how to do it. Will let you know if I am successful.

Admin

Any updates?

I'd be interested to know if this still happens in Greenlock v2.7+.

I made a bunch of updates, added a bunch of tests, and got some corner cases taken care of.

I don't think anything that I did would directly affect this but, aside from the backwards compatibility shims which became more complex, the core flow of the code is a lot simpler now.

Author

@coolaj86, sorry I couldn't reproduce this in my beta environment, and since this was causing problems in production we moved back to the letsencrypt module. Will revisit this in the next couple of weeks.

Admin

We now offer business support at an hourly rate, so if you're interested in pairing together for an hour to look at specifics send a message to aj@therootcompany.com and we can schedule some time to take a look.

Admin

Fixed in v3

coolaj86 closed this issue on 2020-04-21 07:55:54 +00:00
Reference: coolaj86/acme.js-ARCHIVED#17