DnsNameResolverTimeoutException in Spring Boot WebFlux
Posted on June 17, 2024
A few weeks ago, I started noticing errors during the Polidict authentication flow. The application was throwing a DnsNameResolverTimeoutException
when making network requests. The error messages appeared as follows:
io.netty.resolver.dns.DnsNameResolverTimeoutException: [10812: /[fdaa:0:0:0:0:0:0:3]:53] DefaultDnsQuestion(www.googleapis.com. IN AAAA) query '10812' via UDP timed out after 5000 milliseconds (no stack trace available)
Caused by: java.net.UnknownHostException: Failed to resolve 'www.googleapis.com' [A(1), AAAA(28)] after 2 queries
Problem
Initially, I suspected a network issue and criticized fly.io for it on Twitter.
To resolve the issue, I attempted several approaches:
- Changed deployment regions multiple times
- Reverted to the previous version of Spring Boot
- Tuned timeouts in the application properties
However, none of these solutions remedied the problem.
Further investigation revealed that the issue was tied to the DNS resolver used by the application. I found related issues on GitHub:
The problem seemed to originate from Netty's DnsAddressResolverGroup
.
Potential resolutions include:
- Customize the Netty DNS resolver configuration to handle DNS responses more effectively by adjusting timeouts, retries, or other settings.
- Switch to a different DNS resolver library that may better manage large DNS responses.
- Implement a custom DNS resolver.
- Investigate the cause of large DNS responses with your DNS provider to optimize or stabilize the DNS infrastructure.
My Solution
I implemented a global WebClientCustomizer
that switches the DNS resolver to DefaultAddressResolverGroup
.
@Component
public class AddressResolverWebClientCustomizer implements WebClientCustomizer {
@Override
public void customize(WebClient.Builder webClientBuilder) {
webClientBuilder.clientConnector(
new ReactorClientHttpConnector(HttpClient.create()
.resolver(DefaultAddressResolverGroup.INSTANCE)));
}
}
However, this wasn't sufficient. The issue persisted during the OAuth2 login flow. The WebClientReactiveAuthorizationCodeTokenResponseClient
uses the default WebClient.
To address this, I created a custom WebClientReactiveAuthorizationCodeTokenResponseClient
using a customized WebClient.
/**
* Client with customized WebClient. By default, it uses WebClient.builder().build() which is not customized.
* Refer to {@link AddressResolverWebClientCustomizer} for customization details.
*/
@Bean
public WebClientReactiveAuthorizationCodeTokenResponseClient webClientReactiveAuthorizationCodeTokenResponseClient(
WebClient.Builder webClientBuilder
) {
var webClientReactiveAuthorizationCodeTokenResponseClient = new WebClientReactiveAuthorizationCodeTokenResponseClient();
webClientReactiveAuthorizationCodeTokenResponseClient.setWebClient(webClientBuilder.build());
return webClientReactiveAuthorizationCodeTokenResponseClient;
}
The issue is fixed now. I'm still monitoring the application to see if the issue will happen again.
Improvements
For now I changed resolver for all instances of WebClient provided by Spring Boot. It would be better to change resolver only for the WebClient that is used for OAuth2 login flow.
To do this I will just move custom resolver creation directly to webClientReactiveAuthorizationCodeTokenResponseClient
bean declaration.
Conclusion
If you encounter a DnsNameResolverTimeoutException
in your Spring Boot application, it's likely due to the Netty DNS resolver. Customizing the DNS resolver configuration, using a different DNS resolver, or implementing a custom resolver may solve the problem.