Jsoup is a very good package for parsing web pages. It is written in Java and offers DOM-style and CSS-selector-style ways to find and extract content from a document.

Today I built a project that parses a website with Jsoup. When connecting to a certain site with Jsoup.connect(url).get(), I occasionally got a java.net.SocketTimeoutException: Read timed out exception. The cause is that the default socket timeout is fairly short, while some sites respond slowly, so the read times out.

Solution: set a timeout when connecting.

    doc = Jsoup.connect(url).timeout(5000).get();

5000 sets the timeout to 5 s (the value is in milliseconds).

The test code is as follows.

1. Without setting a timeout:

    package jsoupTest;

    import java.io.IOException;

    import org.jsoup.*;
    import org.jsoup.helper.Validate;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class JsoupTest {
        public static void main(String[] args) throws IOException {
            String url = "http://www.weather.com.cn/weather/101010400.shtml";
            long start = System.currentTimeMillis();
            Document doc = null;
            try {
                // No timeout set: the library default applies
                doc = Jsoup.connect(url).get();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                System.out.println("Time is:" + (System.currentTimeMillis() - start) + "ms");
            }
            // doc is still null here if the request failed
            Elements elem = doc.getElementsByTag("Title");
            System.out.println("Title is:" + elem.text());
        }
    }

Sometimes a timeout occurs:

    java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(Unknown Source)
        at java.net.SocketInputStream.read(Unknown Source)
        at java.io.BufferedInputStream.fill(Unknown Source)
        at java.io.BufferedInputStream.read1(Unknown Source)
        at java.io.BufferedInputStream.read(Unknown Source)
        at sun.net.www.http.ChunkedInputStream.fastRead(Unknown Source)
        at sun.net.www.http.ChunkedInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(Unknown Source)
        at java.util.zip.InflaterInputStream.fill(Unknown Source)
        at java.util.zip.InflaterInputStream.read(Unknown Source)
        at java.util.zip.GZIPInputStream.read(Unknown Source)
        at java.io.BufferedInputStream.read1(Unknown Source)
        at java.io.BufferedInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(Unknown Source)
        at org.jsoup.helper.DataUtil.readToByteBuffer(DataUtil.java:113)
        at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:447)
        at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:393)
        at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:159)
        at org.jsoup.helper.HttpConnection.get(HttpConnection.java:148)
        at jsoupTest.JsoupTest.main(JsoupTest.java:17)
    Time is:3885ms
    Exception in thread "main" java.lang.NullPointerException
        at jsoupTest.JsoupTest.main(JsoupTest.java:25)

The NullPointerException at the end is a consequence of the timeout: the exception is caught and printed, so doc is still null when doc.getElementsByTag is called.

2. With a timeout set, it generally does not time out:

    package jsoupTest;

    import java.io.IOException;

    import org.jsoup.*;
    import org.jsoup.helper.Validate;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class JsoupTest {
        public static void main(String[] args) throws IOException {
            String url = "http://www.weather.com.cn/weather/101010400.shtml";
            long start = System.currentTimeMillis();
            Document doc = null;
            try {
                // Allow up to 5000 ms for the connection and read
                doc = Jsoup.connect(url).timeout(5000).get();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                System.out.println("Time is:" + (System.currentTimeMillis() - start) + "ms");
            }
            Elements elem = doc.getElementsByTag("Title");
            System.out.println("Title is:" + elem.text());
        }
    }

Output:

    Time is:4158ms
    Title is:顺义天气预报-今日_明日_一周天气预报:16日星期五 多云转晴 11/-4℃
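
Even with a longer timeout, a slow site can still time out occasionally, and the test program then fails with a NullPointerException because doc stays null. One possible way to handle this is to retry the request a few times and only use the result once a call has succeeded. The following is a minimal sketch of that idea, not part of the original test: the class name TimeoutRetry and the method fetchWithRetry are hypothetical, and the lambda in main is a stand-in for the real request so that the example runs without jsoup on the classpath.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

public class TimeoutRetry {

    /**
     * Runs fetch up to maxAttempts times, retrying only when it throws
     * SocketTimeoutException. If every attempt times out, the last
     * timeout is rethrown instead of returning null.
     */
    public static <T> T fetchWithRetry(Callable<T> fetch, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetch.call();
            } catch (SocketTimeoutException e) {
                last = e;
                System.out.println("Attempt " + attempt + " timed out: " + e.getMessage());
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a slow site (assumption for illustration):
        // the first two calls time out, the third succeeds.
        // With jsoup available, the lambda would instead be
        //   () -> Jsoup.connect(url).timeout(5000).get()
        int[] calls = {0};
        String title = fetchWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new SocketTimeoutException("Read timed out");
            }
            return "page title";
        }, 3);
        System.out.println("Title is:" + title);
    }
}
```

Because the helper either returns a value or throws, the caller never reaches doc.getElementsByTag with a null document. The number of attempts is a judgment call; for a site that is down rather than slow, retrying only delays the failure.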