Tracking down a strange bug in jsoup 1.15.4

The jsoup team fixed this very quickly.

https://github.com/jhy/jsoup/issues/1902


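jsoup is one of the best tools I have personally used: simple, easy to use, and lightweight, especially for parsing web pages, crawling, simulating logins, and so on.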

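One of my current projects uses it. I like to stay on the newest versions, so at the time I picked 1.15.3, which was then the latest, and it ran without any trouble.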

Yesterday I noticed a yellow warning in pom.xml, so I upgraded to 1.15.4. Most API requests kept working normally.

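As an aside, the JDK does not support the PATCH method at the source level. For example, java.net.HttpURLConnection hard-codes its list of valid methods, and PATCH is not among them: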

package java.net;
...
public abstract class HttpURLConnection extends URLConnection {
    ...
    /* valid HTTP methods */
    private static final String[] methods = {
        "GET", "POST", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE"
    };
    ...
    /**
     * Set the method for the URL request, one of:
     * <UL>
     *  <LI>GET
     *  <LI>POST
     *  <LI>HEAD
     *  <LI>OPTIONS
     *  <LI>PUT
     *  <LI>DELETE
     *  <LI>TRACE
     * </UL> are legal, subject to protocol restrictions.  The default
     * method is GET.
     *
     * @param method the HTTP method
     * @throws ProtocolException if the method cannot be reset or if
     *              the requested method isn't valid for HTTP.
     * @throws SecurityException if a security manager is set and the
     *              method is "TRACE", but the "allowHttpTrace"
     *              NetPermission is not granted.
     * @see #getRequestMethod()
     */
    public void setRequestMethod(String method) throws ProtocolException {
        if (connected) {
            throw new ProtocolException("Can't reset method: already connected");
        }
        // This restriction will prevent people from using this class to
        // experiment w/ new HTTP methods using java.  But it should
        // be placed for security - the request String could be
        // arbitrarily long.

        for (int i = 0; i < methods.length; i++) {
            if (methods[i].equals(method)) {
                if (method.equals("TRACE")) {
                    @SuppressWarnings("removal")
                    SecurityManager s = System.getSecurityManager();
                    if (s != null) {
                        s.checkPermission(new NetPermission("allowHttpTrace"));
                    }
                }
                this.method = method;
                return;
            }
        }
        throw new ProtocolException("Invalid HTTP method: " + method);
    }

}
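
The restriction in the source above is easy to observe without sending any request. The following is a minimal sketch; the class name PatchMethodCheck, the helper isAccepted, and the localhost URL are made up for illustration. It relies on the fact that openConnection() only creates the connection object and does not touch the network.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URI;

public class PatchMethodCheck {

    // Hypothetical helper: returns true if HttpURLConnection accepts the method name.
    public static boolean isAccepted(String method) {
        try {
            // openConnection() only builds the object; nothing is sent yet.
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://localhost/").toURL().openConnection();
            conn.setRequestMethod(method);
            return true;
        } catch (ProtocolException e) {
            // Thrown for any name outside the hard-coded methods array.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("PUT accepted:   " + isAccepted("PUT"));
        System.out.println("PATCH accepted: " + isAccepted("PATCH"));
    }
}
```

PUT passes the check while PATCH is rejected with a ProtocolException, which matches the loop in setRequestMethod.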

I tried the X-Method-Override header, but that did not work either, so I moved the endpoints that need PATCH over to okhttp3.
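
For comparison only, and not what this project ended up using: on JDK 11 or later, the separate java.net.http API accepts PATCH through its generic method(...) builder call. A minimal sketch follows, where the class name, the buildPatch helper, the example.com URL, and the JSON body are all placeholders; it only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class PatchRequestSketch {

    // Hypothetical helper: builds (but does not send) a PATCH request with a JSON body.
    public static HttpRequest buildPatch(String url, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                // method(...) takes an arbitrary method name plus a body publisher.
                .method("PATCH", HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildPatch("https://api.example.com/items/1", "{\"name\":\"new\"}");
        System.out.println(req.method() + " " + req.uri());
    }
}
```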


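When I re-tested today, I found that the login endpoint had stopped working. Login goes through an Auth0 link, https://auth0.xxx.com/authorize?client_id=xxx&scope=openid%20email%20profile, where the values inside scope are separated by spaces.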

The '%' gets encoded as well

This works fine in 1.15.3. In 1.15.4, the % inside %20 is encoded again, which produces %2520, and the server returns an error.

The state value cannot be obtained, so logging in to the system fails.
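
The effect of encoding an already-encoded value can be reproduced with the JDK alone. The following is a minimal sketch of the symptom, not jsoup's actual code; the class and helper names are made up, and URLEncoder's '+' for spaces is swapped to %20 to match the URL in this post.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {

    // Percent-encode a value once. URLEncoder emits '+' for spaces, so convert to %20.
    public static String encodeOnce(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // What a server does when it decodes a query parameter a single time.
    public static String decodeOnce(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String scope = "openid email profile";
        String once = encodeOnce(scope);  // spaces become %20
        String twice = encodeOnce(once);  // the '%' itself becomes %25, giving %2520

        System.out.println(once);
        System.out.println(twice);
        // After one decode the server sees literal "%20" text instead of spaces.
        System.out.println(decodeOnce(twice));
    }
}
```

Decoding the double-encoded value once yields the text openid%20email%20profile rather than three space-separated scopes, so the server cannot recognize the scope it was sent.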

After rolling back to 1.15.3, everything works again.