November 29, 2012

Beware the IPv6 DAD Race Condition

One extremely frustrating problem that arises with IPv6 under Linux is the race condition caused at system boot by IPv6's Duplicate Address Detection (DAD). DAD is a new feature in IPv6 that protects against IP address conflicts. After an address is added to an interface, the operating system uses the Neighbor Discovery Protocol to check whether any other host on the network has the same address. If it finds a neighbor with the same address, the address is never brought into service.

The problem is that until DAD can confirm that there is no other host with the same address, the address is considered to be "tentative." While it is in this state, attempts to bind() to the address fail with EADDRNOTAVAIL, as if the address doesn't exist. That means that if you have a service configured to listen on a particular IPv6 address, and that IPv6 address is still tentative when the service starts, it will fail to bind to that address. Very few programs will try to bind again later. Most either continue without listening on the failed address, or fail to start altogether. Apache, for example, fails to start if it can't bind to an address.
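One way to check whether a boot-time bind failure is down to DAD is to look for addresses that are still flagged tentative. The filter below is a minimal sketch. It assumes the one-line output of `ip -o -6 addr show`, with the interface in the second field, the address in the fourth, and flags such as `tentative` after the scope. The function name `tentative_addrs` is made up for illustration.

```shell
#!/bin/sh
# tentative_addrs: read `ip -o -6 addr show` output on stdin and print
# "<interface> <address/prefix>" for every address still flagged tentative.
# Assumed line layout: index, interface, "inet6", address, then scope and flags.
tentative_addrs() {
    awk '$3 == "inet6" {
        for (i = 5; i <= NF; i++)
            if ($i == "tentative") { print $2, $4; break }
    }'
}
```

Run as `ip -o -6 addr show | tentative_addrs`, it should print nothing once every address has passed DAD.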

DAD is fast, but not always fast enough. Since services like Apache are started soon after networking is configured on system boot, there is a race condition. Sometimes your critical services start on boot, and sometimes they don't! This is clearly not acceptable behavior for a production server.

For this reason, I always disable DAD on servers that use IPv6. When DAD is disabled, addresses are immediately usable, just like they are with IPv4. Without DAD, I have to trust myself not to shoot myself in the foot with an address conflict, but that's nothing new. Besides, most of these servers are in data centers that restrict the IP addresses on the switch port anyway.

To disable DAD, you need to write 0 to /proc/sys/net/ipv6/conf/ethX/accept_dad, where ethX is the interface, before you configure the interface. In Debian, I accomplish this using a pre-up directive on the interface stanza, like this:

    iface eth0 inet6 static
        address 3ffe:ffff::4a:5000
        netmask 64
        gateway fe80::1
        pre-up echo 0 > /proc/sys/net/ipv6/conf/eth0/accept_dad
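The same write can be wrapped in a small helper that refuses to proceed when the sysctl file is missing, which catches a mistyped interface name before it silently leaves DAD enabled. This is only a sketch: the function name `disable_dad` and its optional second argument are assumptions, not part of any existing tool. The second argument is an alternate root standing in for /proc/sys, so the function can be exercised without root privileges.

```shell
#!/bin/sh
# disable_dad IFACE [SYSCTL_ROOT]
# Write 0 to net/ipv6/conf/IFACE/accept_dad under SYSCTL_ROOT (default /proc/sys).
# Returns non-zero if the file is missing or the value does not read back as 0.
disable_dad() {
    iface=$1
    root=${2:-/proc/sys}
    file="$root/net/ipv6/conf/$iface/accept_dad"

    if [ ! -e "$file" ]; then
        echo "disable_dad: $file not found (wrong interface name?)" >&2
        return 1
    fi

    echo 0 > "$file" || return 1

    # Read the value back to confirm the write took effect.
    [ "$(cat "$file")" = "0" ]
}
```

It would be called as `disable_dad eth0` from a script run before the address is added, in the same spot as the pre-up line above.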

Clearly, this is less than ideal. To begin with, the problem is not well documented, which will cause endless frustration for administrators trying to roll out IPv6. Beyond that, the workaround is sub-optimal and would lead to the demise of DAD on servers if widely adopted. But a better solution would require more fundamental changes to either the operating system or the applications. The operating system could pause on boot until DAD completes. Ideally, it would know which services listen on which addresses, and delay the start of only those services. Services could retry failed binds after a delay, or set the Linux-specific IP_FREEBIND socket option, which permits binding to a non-local or non-existent address. (Unfortunately, that also means IP_FREEBIND would let you bind to an address that truly doesn't exist, such as a mistyped one, without any error.)
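The "pause until DAD completes" idea can be approximated with a polling loop placed between network configuration and service start. The sketch below is one hedged way to do it. The name `wait_for_dad` and the probe-function convention are invented for illustration. The probe is assumed to print something while any address is still tentative and nothing otherwise. Fractional `sleep` values are assumed to be supported, which is true of GNU coreutils but not guaranteed by POSIX.

```shell
#!/bin/sh
# wait_for_dad PROBE [ATTEMPTS] [DELAY]
# Run PROBE repeatedly until it prints nothing (no tentative addresses left).
# Returns 0 once PROBE's output is empty, 1 if all ATTEMPTS polls are used up.
wait_for_dad() {
    probe=$1
    attempts=${2:-50}
    delay=${3:-0.2}

    while [ "$attempts" -gt 0 ]; do
        if [ -z "$("$probe")" ]; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}

# Example probe: list addresses on eth0 that are still tentative.
# The interface name is a placeholder.
probe_eth0() {
    ip -o -6 addr show dev eth0 tentative
}
```

A boot script could then run `wait_for_dad probe_eth0 || echo "DAD did not settle" >&2` before starting the service. Unlike disabling DAD, this keeps conflict detection in place, at the cost of a short and bounded delay.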

As IPv6 becomes more widespread, I expect this to be addressed in earnest, but until then, disabling DAD is the way to go.

Update (2013-05-05): A bug report has been filed in Debian about this issue. It proposes that ifupdown's init script not return until DAD completes on all interfaces.

Comments

Reader Balakumaran on 2014-01-03 at 06:15:

Nice article... Helped me not to re-invent the wheel

Reader Rishabh on 2015-01-08 at 09:02:

I think dad_transmits is the correct option; I'm not sure about accept_dad.

Andrew Ayer on 2015-01-08 at 16:51:

dad_transmits might also work, but I've been using accept_dad on dozens of servers for 3+ years and it causes addresses to immediately become non-tentative, so I'm confident it works.

Reader Mihai Moldovan on 2015-01-11 at 16:07:

Linux now has an Optimistic DAD implementation; cf. https://lwn.net/Articles/218597/ and RFC 4429 at, for instance, https://www.rfc-editor.org/rfc/rfc4429.txt

This option could potentially fix your problems without the need to disable DAD completely.

Andrew Ayer on 2015-01-11 at 16:31:

Thank you. This is extremely useful information. On Linux, optimistic DAD is controlled by the net.ipv6.conf.*.optimistic_dad sysctl (which is undocumented, which is why I did not know about it before now). I'll be testing this out and will update the blog post with my findings.

Reader Maarten on 2016-06-07 at 21:34:

Interested to know how optimistic DAD worked out for you. I'm not sure why, but I'm still getting services failing to start at boot with optimistic DAD.

Specifically, on a VPS backed by SSD storage, running OpenSMTPD on Arch Linux, I'm still getting errors regarding the bind address being unavailable.

Weirdly enough, adding sleep times (on the order of 5 seconds) before service start, or restarting the service after a time-out, did not help (not that they would be acceptable "solutions"), but disabling DAD did. I suspect there is more going on that I don't know about, because I can't explain these observations.

Reader Joshua Johnson on 2015-04-23 at 17:42:

Andrew, thanks much for this article and Mihai, thanks for the info on optimistic DAD. Very useful.

Reader James Johnston on 2016-04-09 at 03:42:

Thanks for the tip. This issue apparently still exists on Ubuntu 15.10, and this page helped me work around it. I hope the Ubuntu team makes IPv6 support a priority soon... DHCPv6 not working out of the box counts as "badly broken" in my book. (Maybe the issue also exists in Debian still. I have not tested.)

Reader Mark on 2016-09-01 at 23:21:

Here's how I tackle this using zsh without having to disable duplicate address detection:

    ip -6 addr add 2001:DB8:1234:f00::2/64 dev eth0
    wait_dad=0
    until (( ++wait_dad > 100 )) {
        # ip prints addresses in lowercase, so match case-insensitively.
        grep -qi 2001:DB8:1234:f00::2 <(ip -6 r l table local) && no_da=1 && break
        sleep 0.2
    }
    (( no_da )) && ip -6 r a 2001:DB8::/32 via 2001:DB8:1234:f00::ffff dev eth0 src 2001:DB8:1234:f00::2

The trick is that once DAD has finished and the address is safe to use (no one else on the network is using it), the address is added to the local IPv6 route table. So we wait at most 20 seconds for it (a limit that should never be reached), and if we find the address in the local table, we can add the route.

The code should be easy enough for people who don't use zsh to read and adapt for their preferred shell.

Hope this helps others.

Reader Kenny on 2017-09-14 at 20:04:

Thanks Andrew! I followed the article and was able to resolve an issue, and I'm not a Linux admin.

Reader Günther on 2017-12-30 at 17:49:

Thank you Andrew, this is still an issue with Ubuntu LTS 16.04... (not the most recent, I know, but an LTS version and therefore current until after April 2018, when version 18.04 will be released)
