Level 2 Java: fetching web page content with HttpClient
Source: 优易学  2011-10-29 12:18:25   [优易学: China education and examination portal]

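  1. To download the content of a remote URL, you can use HttpClient. The relevant code is collected below; it also deals with the problem of garbled Chinese text.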

  Method 1: transcode the stream

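  // Requires java.io.BufferedReader, IOException, InputStream, InputStreamReader, UnsupportedEncodingException.
  // Reads the stream line by line, decoding its bytes as GBK.
  public String convertStreamToString(InputStream is) throws UnsupportedEncodingException {
    BufferedReader reader = new BufferedReader(new InputStreamReader(is, "gbk"));
    StringBuilder sb = new StringBuilder();
    String line = null;
    try {
      while ((line = reader.readLine()) != null) {
        // readLine() strips the line terminator, so add one back
        sb.append(line).append('\n');
      }
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      try {
        // Closing the reader also closes the underlying stream
        reader.close();
      } catch (IOException e) {
        e.printStackTrace();
      }
    }
    return sb.toString();
  }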
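
  To see why the charset passed to InputStreamReader matters, here is a minimal, self-contained sketch that uses only the JDK. The class name StreamDecodeDemo and the method readAll are hypothetical names chosen for illustration, and the sketch assumes Java 7 or later (try-with-resources) and a JDK that ships the GBK charset. It decodes the same GBK bytes twice: once as GBK, and once as ISO-8859-1, which produces garbled text.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
class StreamDecodeDemo {

    // Reads the whole stream as text using the given charset.
    // try-with-resources closes the reader and the underlying stream.
    static String readAll(InputStream is, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(is, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // "\u4e2d\u6587" is the two-character word for "Chinese", encoded here as GBK bytes.
        byte[] body = "\u4e2d\u6587".getBytes(gbk);

        String decodedAsGbk = readAll(new ByteArrayInputStream(body), gbk);
        String decodedAsLatin1 = readAll(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);

        System.out.println("GBK round trip intact: " + decodedAsGbk.equals("\u4e2d\u6587\n"));
        System.out.println("ISO-8859-1 intact:     " + decodedAsLatin1.equals("\u4e2d\u6587\n"));
    }
}
```

  Decoding with the charset the page was actually written in recovers the original characters; decoding with any other charset turns each byte into an unrelated character, which is the garbling the method above avoids by passing "gbk".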

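  // Download the content
  // Requires org.apache.commons.httpclient.HttpClient, HttpException and methods.GetMethod (Commons HttpClient 3.x).
  private String urlContent(String urlString) throws HttpException, IOException {
    HttpClient client = new HttpClient();
    // Example URL: http://www.tianya.cn/publicforum/articleslist/0/no20.shtml
    GetMethod get = new GetMethod(urlString);
    try {
      client.executeMethod(get);
      // Print the charset reported for the response
      System.out.print(get.getResponseCharSet());
      InputStream iStream = get.getResponseBodyAsStream();
      return convertStreamToString(iStream);
    } finally {
      // Always release the connection, even if the request fails
      get.releaseConnection();
    }
  }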
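
  The download method prints the response charset but the stream is still decoded with a hard-coded "gbk". One possible refinement, sketched below under stated assumptions, is to pick the charset from the Content-Type header value and fall back to GBK only when none is declared. CharsetPicker and fromContentType are hypothetical names introduced for illustration; the sketch uses only the JDK and does not call HttpClient.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper, not part of the original article.
class CharsetPicker {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=gbk", or the fallback when the parameter
    // is missing or names a charset this JVM does not support.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        System.out.println(fromContentType("text/html; charset=utf-8", gbk));
        System.out.println(fromContentType("text/html", gbk));
    }
}
```

  The resulting Charset could then be passed to InputStreamReader in place of the fixed "gbk" string. Pages that declare their encoding only in an HTML meta tag would still need the fallback.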

Editor in charge: 小草
