Cheatography
https://cheatography.com
Basic commands and explanations for HTML parsing with the Java library Jsoup using CSS or jQuery-like selectors
Selector overview
* |
All elements |
document.select("*") |
tagname |
Find elements by tag |
document.select("h1") |
#id |
Find elements by ID |
document.select("#subtitle") |
.class |
Find elements by class name |
document.select(".list") |
|
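The basic selectors above can be tried out with a small sketch like the following. The sample markup and the class name `BasicSelectorsDemo` are made up for illustration; only `Jsoup.parse`, `select`, `text` and `size` are Jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class BasicSelectorsDemo {
    // Hypothetical sample markup, invented for this sketch.
    static final String HTML =
        "<html><body>"
        + "<h1>Title</h1>"
        + "<h2 id=\"subtitle\">Subtitle</h2>"
        + "<ul class=\"list\"><li>One</li><li>Two</li></ul>"
        + "</body></html>";

    // Parses the sample markup and returns every element matching the query.
    static Elements select(String cssQuery) {
        Document document = Jsoup.parse(HTML);
        return document.select(cssQuery);
    }

    public static void main(String[] args) {
        System.out.println(select("h1").text());        // by tag name
        System.out.println(select("#subtitle").text()); // by ID
        System.out.println(select(".list").size());     // by class name
        System.out.println(select("*").size());         // every element in the tree
    }
}
```

`select` always returns an `Elements` list, which is simply empty when nothing matches, so no null check is needed.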
Attribute Selection
[attr] |
Elements with attribute |
document.select("[href]") |
[^attr] |
Elements with an attribute whose name starts with the prefix |
document.select("[^data-]") |
[attr=value] |
Elements whose attribute equals the value |
document.select("[width=100]") |
[attr^=value] |
Elements whose attribute value starts with the value |
document.select("[class^=button]") |
[attr$=value] |
Elements whose attribute value ends with the value |
document.select("[href$=example.com]") |
[attr*=value] |
Elements whose attribute value contains the value |
document.select("[class*=button]") |
[attr~=regex] |
Elements whose attribute value matches the regular expression |
document.select("img[src~=(?i)\\.(png|jpe?g)]") |
|
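As a rough illustration of the attribute selectors above, the sketch below runs each of them against a small, made-up fragment. The markup, the class name `AttributeSelectorsDemo` and the helper `count` are assumptions for this example, not part of Jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class AttributeSelectorsDemo {
    // Hypothetical sample markup, invented for this sketch.
    static final String HTML =
        "<a href=\"https://example.com\" class=\"button-primary\">Home</a>"
        + "<a href=\"https://example.org/docs\" class=\"big button\">Docs</a>"
        + "<img src=\"logo.PNG\" width=\"100\" data-id=\"7\">"
        + "<img src=\"photo.jpeg\" width=\"200\">"
        + "<img src=\"icon.gif\">";

    // Returns how many elements of the sample markup match the query.
    static int count(String cssQuery) {
        Document document = Jsoup.parse(HTML);
        return document.select(cssQuery).size();
    }

    public static void main(String[] args) {
        System.out.println(count("[href]"));              // both links
        System.out.println(count("[^data-]"));            // the image with data-id
        System.out.println(count("[width=100]"));         // exact value
        System.out.println(count("[class^=button]"));     // value starts with "button"
        System.out.println(count("[href$=example.com]")); // value ends with "example.com"
        System.out.println(count("[class*=button]"));     // value contains "button"
        // (?i) makes the regex case-insensitive, so logo.PNG matches as well.
        System.out.println(count("img[src~=(?i)\\.(png|jpe?g)]"));
    }
}
```

Note that `[class^=button]` tests the whole attribute string, so `class="big button"` does not match it; use `.button` to test for a single class name.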
Pseudo selectors
:lt(n) |
Find elements whose sibling index is less than n |
document.select("td:lt(3)") |
:gt(n) |
Find elements whose sibling index is greater than n |
document.select("td:gt(2)") |
:eq(n) |
Find elements whose sibling index is equal to n |
document.select("td:eq(2)") |
:has(selector) |
Find elements that contain elements matching the selector |
document.select("li:has(a)") |
:not(selector) |
Find elements that do not match the selector |
document.select("li:not(#justLink)") |
:contains(text) |
Find elements that contain the given text (case-insensitive) |
document.select(":contains(world)") |
:containsOwn(text) |
Find elements that directly contain the given text |
document.select(":containsOwn(world)") |
:matches(regex) |
Find elements whose text matches the specified regular expression |
document.select(":matches(^Button 1$)") |
:matchesOwn(regex) |
Find elements whose own text matches the specified regular expression |
document.select(":matchesOwn(1)") |
|
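The pseudo selectors above are easier to compare side by side on one fragment. The following is a minimal sketch under assumed sample markup; the class name `PseudoSelectorsDemo` and the helper `count` are hypothetical. Sibling indexes are zero-based, so `td:eq(2)` is the third cell.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class PseudoSelectorsDemo {
    // Hypothetical sample markup, invented for this sketch.
    static final String HTML =
        "<table><tr><td>A</td><td>B</td><td>C</td><td>D</td></tr></table>"
        + "<ul>"
        + "<li id=\"justLink\"><a href=\"#\">Hello world</a></li>"
        + "<li>Plain <b>World</b> item</li>"
        + "<li><button>Button 1</button></li>"
        + "</ul>";

    static final Document DOCUMENT = Jsoup.parse(HTML);

    // Returns how many elements of the sample markup match the query.
    static int count(String cssQuery) {
        return DOCUMENT.select(cssQuery).size();
    }

    public static void main(String[] args) {
        System.out.println(count("td:lt(3)"));            // cells with index 0, 1, 2
        System.out.println(count("td:gt(2)"));            // cell with index 3
        System.out.println(DOCUMENT.select("td:eq(2)").text()); // third cell
        System.out.println(count("li:has(a)"));           // items containing a link
        System.out.println(count("li:not(#justLink)"));   // items without that ID
        // :contains looks at the text of the element and all of its descendants.
        System.out.println(count("li:contains(world)"));
        // :containsOwn looks only at text directly inside the element.
        System.out.println(count(":containsOwn(world)"));
        System.out.println(count("button:matches(^Button 1$)"));
        System.out.println(count(":matchesOwn(1)"));
    }
}
```

In this fragment `li:contains(world)` matches two list items, while `li:containsOwn(world)` would match none, because the word sits inside the nested `a` and `b` elements rather than directly in the `li`.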
Selector combinations
el#id |
Element with ID |
document.select("li#justLink") |
el.class |
Elements with class |
document.select("li.sale") |
el[attr] |
Elements with attribute |
document.select("li[data-price]") |
el[attr][attr].class ... |
Any combination |
document.select("img[src][width]") |
|
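Combined selectors narrow a match by requiring every part to hold for the same element. The sketch below shows this on an invented fragment; the markup, the class name `CombinedSelectorsDemo` and the helper `count` are assumptions made for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class CombinedSelectorsDemo {
    // Hypothetical sample markup, invented for this sketch.
    static final String HTML =
        "<ul>"
        + "<li id=\"justLink\"><a href=\"#\">Link</a></li>"
        + "<li class=\"sale\" data-price=\"9.99\">Shirt</li>"
        + "<li class=\"sale\">Socks</li>"
        + "<li data-price=\"20\">Hat</li>"
        + "</ul>"
        + "<img src=\"a.png\" width=\"10\">"
        + "<img src=\"b.png\">";

    // Returns how many elements of the sample markup match the query.
    static int count(String cssQuery) {
        Document document = Jsoup.parse(HTML);
        return document.select(cssQuery).size();
    }

    public static void main(String[] args) {
        System.out.println(count("li#justLink"));         // li with that ID
        System.out.println(count("li.sale"));             // li with class "sale"
        System.out.println(count("li[data-price]"));      // li with a data-price attribute
        System.out.println(count("img[src][width]"));     // img with both attributes
        System.out.println(count("li[data-price].sale")); // attribute and class together
    }
}
```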
Navigation Through the DOM
ancestor child |
Elements that descend from ancestor at any depth |
document.select("ul li a") |
parent > child |
Child elements that descend directly from parent |
document.select("body > ul > li > ul > li > a") |
siblingA + siblingB |
Sibling B element immediately preceded by sibling A |
document.select(".child1 + .child2") |
siblingA ~ siblingB |
Sibling B element preceded by sibling A |
document.select(".child1 ~ div") |
el, el, el ... |
Unique elements that match any of the selectors |
document.select("div.masthead, div.logo") |
|
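The combinators above differ mainly in how far they reach through the tree. The following sketch contrasts them on a small nested fragment; the markup, the class name `NavigationSelectorsDemo` and the helper `count` are hypothetical and exist only for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class NavigationSelectorsDemo {
    // Hypothetical sample markup, invented for this sketch.
    static final String HTML =
        "<html><body>"
        + "<ul><li><a href=\"/top\">Top</a>"
        + "<ul><li><a href=\"/nested\">Nested</a></li></ul>"
        + "</li></ul>"
        + "<section>"
        + "<div class=\"child1\">1</div>"
        + "<div class=\"child2\">2</div>"
        + "<p>text</p>"
        + "<div class=\"child3\">3</div>"
        + "</section>"
        + "<header>"
        + "<div class=\"masthead logo\">M</div>"
        + "<div class=\"logo\">L</div>"
        + "</header>"
        + "</body></html>";

    static final Document DOCUMENT = Jsoup.parse(HTML);

    // Returns how many elements of the sample markup match the query.
    static int count(String cssQuery) {
        return DOCUMENT.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Descendant: links at any depth below a ul and an li.
        System.out.println(count("ul li a"));
        // Child: every step must be a direct parent-child relation.
        System.out.println(
            DOCUMENT.select("body > ul > li > ul > li > a").attr("href"));
        // Adjacent sibling: .child2 directly after .child1.
        System.out.println(count(".child1 + .child2"));
        // General sibling: any later div sibling of .child1.
        System.out.println(count(".child1 ~ div"));
        // Group: an element matching both selectors is returned only once.
        System.out.println(count("div.masthead, div.logo"));
    }
}
```

In the last query the first `div` matches both `div.masthead` and `div.logo`, yet it appears once in the result.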