The URL class of the java.net package represents a Uniform Resource Locator, which points to a resource on the World Wide Web. The resource can be a file, a directory, or a reference to a more complex object such as a database query.
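As a small illustration of what a URL object holds, the sketch below splits a URL string into its parts. The class name UrlParts, the method describe, and the example.com address are hypothetical and used only for illustration. The URL is built with URI.create(...).toURL() because the URL(String) constructor is deprecated in recent JDK releases.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class, not part of the original example
class UrlParts {
    // Returns "protocol|host|port|path" for the given URL string
    static String describe(String spec) throws MalformedURLException {
        URL url = URI.create(spec).toURL();
        // getPort() returns -1 when the URL does not name a port explicitly
        return url.getProtocol() + "|" + url.getHost() + "|"
                + url.getPort() + "|" + url.getPath();
    }

    public static void main(String[] args) throws MalformedURLException {
        // prints "http|www.example.com|8080|/docs/index.html"
        System.out.println(describe("http://www.example.com:8080/docs/index.html"));
    }
}
```

Creating the URL object does not contact the server. A connection is opened only when a method such as openStream() is called.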
The openStream() method of this class opens a connection to the URL represented by the current object and returns an InputStream from which you can read the data at that URL.
Therefore, to read data from a web page using the URL class:
Instantiate the java.net.URL class by passing the URL of the desired web page as a parameter to its constructor.
Invoke the openStream() method and retrieve the InputStream object.
Instantiate the Scanner class by passing the above retrieved InputStream object as a parameter.
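The three steps above can be sketched as one reusable method. This is a minimal sketch under stated assumptions: the names PageReader and readPage are hypothetical, the content is assumed to be UTF-8, and a temporary local file (reached through a file: URL) stands in for a web page so that the sketch runs without network access.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

// Hypothetical helper class, not part of the original example
class PageReader {
    // Reads everything the URL serves into one String (UTF-8 is an assumption)
    static String readPage(URL url) throws IOException {
        // try-with-resources closes the Scanner and the stream when done
        try (InputStream in = url.openStream();
             Scanner sc = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // "\\A" matches only the start of the input, so next() returns
            // the whole stream as a single token, whitespace included
            sc.useDelimiter("\\A");
            return sc.hasNext() ? sc.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a web page (assumption for offline use)
        Path page = Files.createTempFile("page", ".html");
        Files.write(page, "<html><body><h1>It works!</h1></body></html>"
                .getBytes(StandardCharsets.UTF_8));
        System.out.println(readPage(page.toUri().toURL()));
        Files.delete(page);
    }
}
```

The same method should work for an http: or https: URL, since openStream() is defined on the URL class regardless of protocol.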
Example
import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

public class ReadingWebPage {
   public static void main(String args[]) throws IOException {
      //Instantiating the URL class
      URL url = new URL("http://www.something.com/");
      //Retrieving the contents of the specified page
      Scanner sc = new Scanner(url.openStream());
      //Instantiating the StringBuffer class to hold the result
      StringBuffer sb = new StringBuffer();
      //next() returns whitespace-delimited tokens, so whitespace is not kept
      while(sc.hasNext()) {
         sb.append(sc.next());
      }
      //Closing the Scanner also closes the underlying stream
      sc.close();
      //Retrieving the String from the String Buffer object
      String result = sb.toString();
      System.out.println(result);
      //Removing the HTML tags
      result = result.replaceAll("<[^>]*>", "");
      System.out.println("Contents of the web page: "+result);
   }
}
Output
<html><body><h1>Itworks!</h1></body></html>
Contents of the web page: Itworks!
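The output reads "Itworks!" without a space because Scanner.next() splits its input on whitespace and the example joins the tokens directly. The sketch below contrasts that with reading whole lines. It assumes the page text was "It works!" (inferred from the output, not confirmed by the source), and the names TokenDemo, joinTokens, joinLines and stripTags are hypothetical.

```java
import java.util.Scanner;

// Hypothetical demo class, not part of the original example
class TokenDemo {
    // Joins whitespace-delimited tokens, as a hasNext()/next() loop does
    static String joinTokens(String html) {
        StringBuilder sb = new StringBuilder();
        try (Scanner sc = new Scanner(html)) {
            while (sc.hasNext()) {
                sb.append(sc.next());
            }
        }
        return sb.toString();
    }

    // Joins whole lines, keeping the spaces inside each line
    static String joinLines(String html) {
        StringBuilder sb = new StringBuilder();
        try (Scanner sc = new Scanner(html)) {
            while (sc.hasNextLine()) {
                sb.append(sc.nextLine()).append('\n');
            }
        }
        return sb.toString();
    }

    // Removes anything that looks like an HTML tag (same regex as the example)
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // Assumed page content, with a space between the two words
        String html = "<html><body><h1>It works!</h1></body></html>";
        System.out.println(stripTags(joinTokens(html)));       // prints "Itworks!"
        System.out.println(stripTags(joinLines(html)).trim()); // prints "It works!"
    }
}
```

Reading with hasNextLine() and nextLine() is one way to keep the spacing. The tag-stripping regex is a simple approximation and is not a full HTML parser.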