How do you read the contents of a web page in Java without using any external libraries?

Date: 2023-09-11 | Category: Java | Author: 编程之家

The URL class of the java.net package represents a Uniform Resource Locator, a pointer to a resource on the World Wide Web. The resource can be a file, a directory, or a reference to a more complex object such as a query.

The openStream() method of this class opens a connection to the URL represented by the current object and returns an InputStream from which you can read the resource's data.
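
As a minimal sketch of what openStream() hands back, the helper below reads every byte from the returned InputStream and decodes it. The class name UrlStreamSketch and the method readAll are hypothetical names chosen for illustration, UTF-8 is an assumed encoding, and a temporary local file (reached through a file: URL) stands in for a web page so the sketch runs offline. It relies on InputStream.readAllBytes() (Java 9+) and Files.writeString (Java 11+).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class UrlStreamSketch {

   // Reads everything the URL's stream yields and decodes it as UTF-8 (assumed encoding)
   static String readAll(URL url) throws IOException {
      // try-with-resources closes the stream even if reading fails
      try (InputStream in = url.openStream()) {
         return new String(in.readAllBytes(), StandardCharsets.UTF_8);
      }
   }

   public static void main(String[] args) throws IOException {
      // A temporary local file stands in for a web page so the sketch runs offline
      Path page = Files.createTempFile("page", ".html");
      Files.writeString(page, "<html><body><h1>It works!</h1></body></html>");
      URL url = page.toUri().toURL();
      System.out.println(readAll(url));
      Files.delete(page);
   }
}
```

The same readAll call should work unchanged with an http or https URL, since openStream() returns an InputStream in either case.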

Therefore, to read data from a web page using the URL class:

1. Instantiate the java.net.URL class by passing the URL of the desired web page as a parameter to its constructor.
2. Invoke the openStream() method and retrieve the InputStream object.
3. Instantiate the Scanner class by passing the retrieved InputStream object as a parameter.

Example

import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
public class ReadingWebPage {
   public static void main(String[] args) throws IOException {
      //Instantiating the URL class
      URL url = new URL("http://www.something.com/");
      //Instantiating the StringBuilder class to hold the result
      StringBuilder sb = new StringBuilder();
      //Retrieving the contents of the specified page (assumes UTF-8);
      //try-with-resources closes the Scanner and the underlying stream
      try (Scanner sc = new Scanner(url.openStream(), StandardCharsets.UTF_8.name())) {
         //Reading line by line keeps the spaces inside each line,
         //which token-based reading with next() would drop
         while (sc.hasNextLine()) {
            sb.append(sc.nextLine()).append('\n');
         }
      }
      //Retrieving the String from the StringBuilder object
      String result = sb.toString().trim();
      System.out.println(result);
      //Removing the HTML tags
      result = result.replaceAll("<[^>]*>", "");
      System.out.println("Contents of the web page: " + result);
   }
}

Output

<html><body><h1>It works!</h1></body></html>
Contents of the web page: It works!
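
The tag-removal step in the example relies on the regular expression <[^>]*>, which matches a "<", any run of characters other than ">", and the closing ">". The sketch below isolates that step so its behavior is easier to see. TagStripSketch and stripTags are hypothetical names used only for illustration, and the approach is a naive text substitution rather than real HTML parsing.

```java
public class TagStripSketch {

   // Removes anything that looks like <...>; a naive regex pass, not an HTML parser
   static String stripTags(String html) {
      return html.replaceAll("<[^>]*>", "");
   }

   public static void main(String[] args) {
      String html = "<html><body><h1>It works!</h1><p>a &amp; b</p></body></html>";
      // Tags disappear, but the entity &amp; is left undecoded
      System.out.println(stripTags(html)); // prints "It works!a &amp; b"
   }
}
```

Note that text from adjacent elements is joined without a separator and character entities such as &amp; are not decoded, so this is suitable only for simple pages.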
