https://steemitimages.com/DQmYstyySFoVnL2Xve8F6ZFF3ukGBkvNdLamgFi86kYDj1Q/23.png
What Will You Learn?
In this tutorial you will learn about jsoup: its basic features and how to set up a local development environment for it.
What is jsoup
jsoup is a Java-based library for working with HTML content. It provides a convenient API to extract and manipulate data, using the best of DOM, CSS, and jQuery-like methods. It implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.
Requirements
Basic Java Programming
A good grasp of OOP concepts is a plus
Difficulty
Intermediate
jsoup - Overview
The jsoup library implements the WHATWG HTML5 specification and parses HTML content into the same DOM that modern browsers produce.
The jsoup library provides the following functionality.
Multiple Read Support - It reads and parses HTML from a URL, file, or string.
CSS Selectors - It can find and extract data, using DOM traversal or CSS selectors.
DOM Manipulation - It can manipulate the HTML elements, attributes, and text.
Prevent XSS attacks - It can clean user-submitted content against a given safe white-list, to prevent XSS attacks.
Tidy - It outputs tidy HTML.
Handles invalid data - jsoup can handle unclosed tags and implicit tags, and can reliably create the document structure.
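The features listed above can be sketched in a few lines of Java. This is a minimal illustration, not the tutorial's own code: the class name JsoupFeatureSketch, its helper methods, and the sample HTML are made up for this example, and it assumes a recent jsoup release on the classpath, where the cleaning class is named Safelist (older releases called it Whitelist).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

// Hypothetical demo class; the name and helpers are illustrative only.
public class JsoupFeatureSketch {

    // Parse an HTML string. jsoup adds the missing <html>, <head>, and <body>.
    static Document parse(String html) {
        return Jsoup.parse(html);
    }

    // Use a CSS selector to find the first link; return its href, or "" if none.
    static String firstLinkHref(Document doc) {
        Element link = doc.select("a[href]").first();
        return link == null ? "" : link.attr("href");
    }

    // Remove everything that jsoup's built-in basic safelist does not permit.
    static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        // Deliberately sloppy input: no <html> or <body>, and unclosed <p> tags.
        String html = "<title>Demo</title><p>First<p>Second <a href='https://jsoup.org/'>jsoup</a>";
        Document doc = parse(html);

        System.out.println("Title: " + doc.title());
        System.out.println("Paragraphs found: " + doc.select("p").size());
        System.out.println("First link: " + firstLinkHref(doc));

        // DOM manipulation: add an attribute and replace the link text.
        Element link = doc.select("a[href]").first();
        if (link != null) {
            link.attr("rel", "nofollow").text("jsoup home");
        }

        // Print the body as tidy, well-formed HTML.
        System.out.println(doc.body().html());

        // Clean user-submitted content before displaying it.
        System.out.println(sanitize("<p>Hello<script>alert('x')</script></p>"));
    }
}
```

To read from a URL or a file instead of a string, jsoup also offers Jsoup.connect(url).get() and Jsoup.parse(file, charsetName); both can throw IOException, so they are left out of this sketch.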
https://steemitimages.com/DQmSV5FpAUUi6Hm2UuUeAUrUyTEjThkmiAi8f9WWxXk5gsz/Screenshot_15.png
Local Environment Setup
jsoup is a library for Java, so the very first requirement is to have the JDK installed on your machine.
System Requirement
https://steemitimages.com/DQmUULA1kWm5PaYmPZf9SqFcoPzj6VYTVPQjpqLgs8TXh1T/Screenshot_1.png
Step 1: Verify Java Installation in Your Machine
https://steemitimages.com/DQmQWcRj5JBLFZt9dGidNEVEXCriYo9QbGHrVqR6VfRV4xR/1.png
https://steemitimages.com/DQmTmFVFZcYDDRFF6wQYF8c4NiGb8W7TWbVV19ZPSdrmT4N/2.png
Step 2: Set JAVA Environment
https://steemitimages.com/DQmfF7gv1BhBDhpQZHCW8gdBtjQufKNfWb8GZiLSRGAs7Nz/21.png
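As a complement to Steps 1 and 2, the Java installation and the JAVA_HOME setting can also be checked from inside a small Java program. This is only a sketch: the class name JavaEnvCheck is made up for this example, and it assumes that the environment variable configured in Step 2 is JAVA_HOME.

```java
// Hypothetical helper class; the name is illustrative only.
public class JavaEnvCheck {

    // Version of the Java runtime that is executing this program.
    static String javaVersion() {
        return System.getProperty("java.version");
    }

    // Value of the JAVA_HOME environment variable, or a marker if it is unset.
    static String javaHomeEnv() {
        String value = System.getenv("JAVA_HOME");
        return value == null ? "(not set)" : value;
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + javaVersion());
        System.out.println("JAVA_HOME:    " + javaHomeEnv());
    }
}
```

If the program compiles and runs, the JDK is installed and on the path. A JAVA_HOME value of "(not set)" suggests that Step 2 still needs to be done.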
Step 3: Download jsoup Archive
https://steemitimages.com/DQmb826mDfUzAv4sbpRbwFFpiQAyzrZMji3TGF68D61QZoT/3.png
Step 4: Set jsoup Environment
https://steemitimages.com/DQmZo3H1A6mWAv4A8CRj3V9YjEgq47PEjZbdKvpPp4R3UiD/4.png
Step 5: Set CLASSPATH Variable
https://steemitimages.com/DQmNuqM72FwREWZLtUkzANKmuJ64HVxqhg7jn6c7R8Ty83n/5.png
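Once the CLASSPATH variable is set, one way to confirm that the jsoup jar is actually visible to Java is to try loading its main class. The sketch below uses only the standard library; the class name ClasspathCheck and its helper method are made up for this example, while org.jsoup.Jsoup is the real entry-point class of the library.

```java
// Hypothetical helper class; the name is illustrative only.
public class ClasspathCheck {

    // Return true if the named class can be found on the current classpath.
    // The class is located but not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Show the classpath the JVM is actually using.
        System.out.println("Classpath: " + System.getProperty("java.class.path"));

        if (isOnClasspath("org.jsoup.Jsoup")) {
            System.out.println("jsoup found on the classpath.");
        } else {
            System.out.println("jsoup NOT found; check the CLASSPATH setting from Step 5.");
        }
    }
}
```

Run it from a new terminal window so that the updated environment variables are picked up.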
Source: Tutorialspoint.com