I use the Apache HttpClient library (4.5.14). I make a request that receives a 302 along with a Location response header.
The Location value is percent-encoded, but HttpClient seems to decode it and then attempts a request to the decoded URL.
This is the Location header in the response:
2023-10-31 07:25:38.899 DEBUG --- [ main] org.apache.http.headers : http-outgoing-2 << Location: https://[hostname-redacted]/temp/AB-C0244_BONNIE_BANKS_O_LOCH_LOMOND16987335293607/AB-C0244_TK22.2_BONNIE_BANKS_O%27_LOCH_LOMOND_General_Underscore_Abaco_Music_Library_(PRS).wav?Expires=1698733651&Signature=ZkxxNJhbLrX84ykFuFA7pKEhc05pCrUGJ8oQNjUbCswvh5o8K2k7W3HlG16zrX4EO2co4M1074X2uoHAlpTjnqhSz6FSnWeFud0LIEU6z99nvYmll6bgj1VQg~FqFz2aGLHF9qzRIeM2e73gWBU8bu5EVg5X5n-m8gdyz6-zg5M_&Key-Pair-Id=APKAIBH6A7AA3GUADXUQ
This is the URL that HttpClient then attempts to request:
2023-10-31 07:25:38.902 DEBUG --- [ main] o.a.http.impl.execchain.MainClientExec : Executing request GET /temp/AB-C0244_BONNIE_BANKS_O_LOCH_LOMOND16987335293607/AB-C0244_TK22.2_BONNIE_BANKS_O'_LOCH_LOMOND_General_Underscore_Abaco_Music_Library_(PRS).wav?Expires=1698733651&Signature=ZkxxNJhbLrX84ykFuFA7pKEhc05pCrUGJ8oQNjUbCswvh5o8K2k7W3HlG16zrX4EO2co4M1074X2uoHAlpTjnqhSz6FSnWeFud0LIEU6z99nvYmll6bgj1VQg~FqFz2aGLHF9qzRIeM2e73gWBU8bu5EVg5X5n-m8gdyz6-zg5M_&Key-Pair-Id=APKAIBH6A7AA3GUADXUQ HTTP/1.1
I've tried using a custom redirect strategy, but it doesn't help. Somehow the URL seems to be URL decoded after any RedirectStrategy has handed off the URI.
Is there a way for me to hook into the request to ensure that the request URL stays percent-encoded?
Update
Having investigated this further, it seems that DefaultRedirectStrategy.getLocationURI is to blame.
At some point, it does:
    if (config.isNormalizeUri()) {
        uri = URIUtils.normalizeSyntax(uri);
    }
From what I can read, this should (among other things) decode percent-encoded triplets of unreserved characters. However, in my case it also decodes %27, which seems wrong: the apostrophe is a reserved sub-delimiter in RFC 3986, not an unreserved character.
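To illustrate how a decode-then-re-encode round trip can silently drop an escape such as %27, here is a small sketch that uses only java.net.URI. It is not HttpClient's own code, and the class name PercentRoundTrip, the method decodeAndRebuild and the host example.com are made up for the illustration. The point is only that once the path has been decoded, an encoder that considers the apostrophe legal in a path has no reason to escape it again.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PercentRoundTrip {

    // Hypothetical helper: decode the path, then rebuild the URI from the
    // decoded components, letting java.net.URI re-quote what it must.
    static URI decodeAndRebuild(URI original) throws URISyntaxException {
        String decodedPath = original.getPath(); // "%27" is decoded to "'"
        // The multi-argument constructor quotes only characters that are
        // illegal in a path; the apostrophe is legal, so it stays literal.
        return new URI(original.getScheme(), original.getHost(), decodedPath, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI original = new URI("http://example.com/test%27in");
        System.out.println(original.getRawPath());      // /test%27in
        System.out.println(original.getPath());         // /test'in
        System.out.println(decodeAndRebuild(original)); // http://example.com/test'in
    }
}
```

A server that treats the escaped and unescaped forms as different paths, or a signed URL like the one above, will reject the rebuilt form even though the two are often considered equivalent.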
UPDATE 2: code which demonstrates the issue
Note: to see what's going on here, you need to enable debug logs for org.apache.http. In the debug logs, you will see that the redirect to /test%27in is attempted as /test'in.
You also need the following dependency in your pom.xml:
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.14</version>
</dependency>
package test;

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class ApacheTest {

    public static void main(String[] args) throws IOException, InterruptedException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/test", new RedirectHttpHandler());
        server.createContext("/test'in", new SuccessHttpHandler());
        server.setExecutor(null);
        server.start();
        String uri = "http://127.0.0.1:8000/test";
        ApacheTest t = new ApacheTest();
        t.doRequest(uri);
        server.stop(0);
    }

    private CloseableHttpClient getClient() {
        RequestConfig requestConfig =
                RequestConfig.custom().setNormalizeUri(true).build();
        CloseableHttpClient httpClient =
                HttpClients.custom().setDefaultRequestConfig(requestConfig).build();
        return httpClient;
    }

    private void doRequest(String uri) {
        try {
            CloseableHttpClient c = getClient();
            HttpGet get = new HttpGet(uri);
            CloseableHttpResponse response = c.execute(get);
            if (response.getStatusLine().getStatusCode() != 200) {
                throw new Exception("Expected 200 OK");
            }
            System.out.println("Got 200 OK!");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static class RedirectHttpHandler implements HttpHandler {
        @Override
        public void handle(HttpExchange t) throws IOException {
            t.getResponseHeaders().add("Location", "/test%27in");
            t.sendResponseHeaders(302, 0);
            OutputStream os = t.getResponseBody();
            os.close();
        }
    }

    public static class SuccessHttpHandler implements HttpHandler {
        @Override
        public void handle(HttpExchange t) throws IOException {
            String response = "This is the response";
            t.sendResponseHeaders(200, response.length());
            OutputStream os = t.getResponseBody();
            os.write(response.getBytes());
            os.close();
        }
    }
}
There is a workaround that is specific to the problem at hand, but it may be possible to make a general solution out of it.
Create a subclass of DefaultRedirectStrategy and override the createLocationURI(String) method. This method receives the redirect URL as a string, which you can massage before creating a URI out of it. Here, we can encode the % character itself (as %25), so that the "encoded triplet" (using your wording) %27 becomes %2527 and the 27 is retained. So, when the actual decoding happens, we get back %27.
Now, register this subclass with the HttpClientBuilder by passing an instance to setRedirectStrategy.
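A minimal sketch of the two steps described in this answer, assuming HttpClient 4.5.x is on the classpath. The names EncodedRedirectSketch, PercentEscapingRedirectStrategy and escapePercent are hypothetical and chosen only for this illustration; createLocationURI and setRedirectStrategy are the library's own methods. The sketch shows the wiring only. Whether the final request line ends up with %27 depends on how later normalization steps in HttpClient treat the %25 escape, so check the debug logs before relying on it.

```java
import java.io.IOException;
import java.net.URI;
import org.apache.http.ProtocolException;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultRedirectStrategy;
import org.apache.http.impl.client.HttpClients;

public class EncodedRedirectSketch {

    // Hypothetical helper: escape every '%' so that "%27" becomes "%2527".
    static String escapePercent(String location) {
        return location.replace("%", "%25");
    }

    // Hypothetical subclass: massages the Location value before the
    // default strategy turns it into a URI.
    static class PercentEscapingRedirectStrategy extends DefaultRedirectStrategy {
        @Override
        protected URI createLocationURI(String location) throws ProtocolException {
            return super.createLocationURI(escapePercent(location));
        }
    }

    // Registers the custom strategy on the client builder.
    static CloseableHttpClient buildClient() {
        return HttpClients.custom()
                .setRedirectStrategy(new PercentEscapingRedirectStrategy())
                .build();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(escapePercent("/test%27in"));
        try (CloseableHttpClient client = buildClient()) {
            // Use the client here, for example with the ApacheTest server above.
        }
    }
}
```

Escaping every % is a blunt instrument: it also rewrites escapes that would otherwise have survived normalization, so a more general solution would probably need to be selective about which triplets it protects.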