This is a draft cheat sheet. It is a work in progress and is not finished yet.
BeautifulSoup (parse a website)
from bs4 import BeautifulSoup | import module
soup = BeautifulSoup(html, "lxml") | create new object (lxml parser)
soup.find('a') | finds the first <a> tag
S_all = soup.find_all('a') | finds all <a> tags, saves as list
S_list = soup.find_all(['a', 'b']) | finds all <a> and <b> tags
S_all_tags = soup.find_all(True) | finds all tags, but no text
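A minimal sketch of the calls above run against a small document. The sample markup and the variable names are assumptions made for illustration, and the built-in "html.parser" is used so the snippet runs without the lxml package; pass "lxml" instead if it is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the cheat sheet.
html = """
<html><body>
  <p>Intro <b>bold</b> text</p>
  <a href="/first">First</a>
  <a href="/second">Second</a>
</body></html>
"""

# "html.parser" ships with Python; swap in "lxml" if that package is available.
soup = BeautifulSoup(html, "html.parser")

first_link = soup.find("a")                 # first <a> tag, or None if absent
all_links = soup.find_all("a")              # list-like result with every <a> tag
links_and_bold = soup.find_all(["a", "b"])  # every <a> and <b> tag, in document order
all_tags = soup.find_all(True)              # every tag, text nodes excluded

print(first_link["href"])
print([a.get_text() for a in all_links])
print([tag.name for tag in links_and_bold])
print([tag.name for tag in all_tags])
```

find() returns a single tag or None, while find_all() always returns a list-like result, so check for None before indexing the result of find().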
Requests
import requests | import module
| get html
| url content in object form
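The code cells for this section are missing, so the following is only a sketch of one common way to fetch a page with Requests, not necessarily what the original rows contained. The helper name fetch, the timeout value, and the placeholder URL are assumptions.

```python
import requests


def fetch(url: str) -> requests.Response:
    """Download a URL and return the Response object (hypothetical helper)."""
    response = requests.get(url, timeout=10)  # network call; timeout is an assumed value
    response.raise_for_status()               # raise an exception on 4xx/5xx status codes
    return response


# Usage (needs network access; the URL is a placeholder):
# response = fetch("https://example.com")
# html = response.text      # body decoded to str, ready to pass to BeautifulSoup
# raw = response.content    # body as raw bytes
```

Returning the Response object keeps both the decoded text and the raw bytes available to the caller.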
Qt (graphical interface)
Selenium (automated web experience)
JSON (write or read from JSON file)
import json | import module
json.dumps(data) | converts a dictionary to a JSON-formatted string
J_sorted = json.dumps(data, sort_keys=True) | same, but with the keys in sorted order
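A short sketch of the round trip, assuming a small sample dictionary and a temporary file path (both made up for illustration). json.dumps returns a string, while json.dump and json.load write to and read from a file.

```python
import json
import os
import tempfile

# Hypothetical sample data, not taken from the cheat sheet.
data = {"name": "Ada", "age": 36, "langs": ["python", "c"]}

as_text = json.dumps(data)                    # str, keys in insertion order
as_sorted = json.dumps(data, sort_keys=True)  # str, keys sorted alphabetically

# Write the dictionary to a file, then read it back.
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(as_text)
print(as_sorted)
print(loaded == data)
```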