Cheatography

Files and Internet Scraping Python Cheat Sheet by leenmajz

This is a cheat sheet for programming in Python.

Unicode

Code point higher than 127
'\u0915' or 'क'

File Modes

                   r    r+   w    w+   a    a+
read               *    *         *         *
write                   *    *    *    *    *
create                       *    *    *    *
truncate                     *    *
position at start  *    *    *    *
position at end                        *    *
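
The table above can be checked with a short experiment. This is only a sketch: the scratch file name `modes_demo.txt` and the temporary directory are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:       # "w" creates the file (or truncates an existing one)
    f.write("first\n")

with open(path, "a") as f:       # "a" keeps the content and writes at the end
    f.write("second\n")

with open(path, "r+") as f:      # "r+" reads and writes, positioned at the start
    start = f.tell()             # 0
    f.write("FIRST")             # overwrites the first five characters in place

with open(path) as f:            # default mode is "r"
    after_r_plus = f.read()      # "FIRST\nsecond\n"

with open(path, "w+") as f:      # "w+" truncates first, then allows reading
    after_w_plus = f.read()      # "" because the old content is gone
```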

Reading a File

Opening and closing a file
file = open(file_name, 'r', encoding='utf-8')
file.close()
Opening with a context manager (the file is closed automatically)
with open(file_name, 'r', encoding='utf-8') as file:
Reading a file
Reads the entire file as a string
    content = file.read()
Reads the first 10 characters
    content = file.read(10)
Stores the contents as a list of lines
    lines = file.readlines()
Reads the next line as a string
    line = file.readline()
Iterating through lines
for line in lines:
    # strip() removes the trailing newline character
    print(line.strip())
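
As a rough illustration of how the read calls above share one file position, the sketch below writes a small sample file first. The file name `sample.txt` and its contents are assumptions for the example.

```python
import os
import tempfile

# Hypothetical sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\ngamma\n")

with open(path, "r", encoding="utf-8") as file:
    first_ten = file.read(10)        # "alpha\nbeta" (10 characters)
    rest_of_line = file.readline()   # "\n", the remainder of the current line
    remaining = file.readlines()     # ["gamma\n"], everything not yet read

# A file object can also be iterated directly, one line at a time.
with open(path, "r", encoding="utf-8") as file:
    stripped = [line.strip() for line in file]
```

Each call continues from where the previous one stopped, which is why `readlines()` returns only the last line here.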
Writing to a text file
with open(file_name, 'w') as file:
Writing one line
    file.write("Hello, World!\n")
Writing a list of lines (writelines() does not add newlines)
    file.writelines(lines)
Appending to a text file
with open(file_name, 'a') as file:
    file.write("\nAppending a new line.")
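
A minimal sketch of writing and then appending, assuming a scratch file named `out.txt` in a temporary directory. Since `writelines()` writes the strings exactly as given, the newline is added by hand.

```python
import os
import tempfile

# Hypothetical output file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
lines = ["one", "two", "three"]

with open(path, "w") as file:
    # writelines() does not append "\n", so add it to each item.
    file.writelines(line + "\n" for line in lines)

with open(path, "a") as file:
    file.write("four\n")             # appended after the existing content

with open(path) as file:
    result = file.read().splitlines()
```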
Checking file existence
import os
os.path.exists(file_name)
Handling file exceptions
try:
    with open(file_name) as f:
        content = f.read()
    print(content)
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"An error occurred: {e}")
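
One possible way to package the pattern above is a small helper. The name `read_or_default` and the missing file path are hypothetical and exist only for this sketch.

```python
import os
import tempfile

def read_or_default(file_name, default=""):
    """Return the file's text, or `default` if the file does not exist."""
    try:
        with open(file_name, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default

# A path inside a fresh temporary directory, so it cannot exist yet.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

exists = os.path.exists(missing)                       # False
content = read_or_default(missing, default="<empty>")  # "<empty>"
```

Catching the exception is usually more reliable than checking `os.path.exists()` first, because the file could disappear between the check and the `open()` call.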
Telling/seeking in a file
with open(file_name) as f:
Gives the current position
    f.tell()
Moves the cursor to the beginning
    f.seek(0)
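
A short sketch of `tell()` and `seek()` in action. It uses `io.StringIO`, an in-memory object that behaves like an open text file, as a stand-in for a real file (an assumption made to keep the example self-contained).

```python
import io

f = io.StringIO("hello world")   # stand-in for an open text file

first = f.read(5)     # "hello"
position = f.tell()   # 5, the cursor sits just after "hello"

f.seek(0)             # move the cursor back to the beginning
again = f.read(5)     # "hello" once more
```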

csv Library

import csv
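Reading a CSV file
csv.reader returns an iterator over rows (each row is a list of strings)
content = csv.reader(file, delimiter=";")
Collects all rows into a list
data = list(content)
Writing to a CSV file
writer = csv.writer(file)
Writes each inner list as one row
writer.writerows(rows)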
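
The sketch below round-trips a few rows through the csv module. The sample rows are made up, and an in-memory `io.StringIO` buffer stands in for a file opened with `newline=""`.

```python
import csv
import io

# Hypothetical sample data; every row is a list of column values.
rows = [["name", "age"], ["Saint", "25"], ["Watson", "35"]]

buffer = io.StringIO(newline="")          # in-memory stand-in for a file
writer = csv.writer(buffer, delimiter=";")
writer.writerows(rows)                    # one output line per inner list

buffer.seek(0)                            # rewind before reading back
data = list(csv.reader(buffer, delimiter=";"))
```

Note that the reader always yields strings, so numeric columns need an explicit conversion such as `int(row[1])`.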

Loading JSONs

import json
Loading a JSON file (json_file is an open file object)
json_data = json.load(json_file)
Parsing a JSON string
json_data = json.loads(json_string)
Converting data to a JSON string
string = json.dumps(data, indent=2)
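
A minimal sketch of the difference between the string and file variants, using made-up data and an `io.StringIO` object in place of an opened file.

```python
import io
import json

# Hypothetical data for the demonstration.
data = {"name": "Saint", "tags": ["a", "b"], "age": 25}

text = json.dumps(data, indent=2)                 # Python object -> JSON string

parsed_from_string = json.loads(text)             # loads() takes a string
parsed_from_file = json.load(io.StringIO(text))   # load() takes a file-like object
```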
 

Random library

 
import random
Returns a random float in the range [0.0, 1.0)
random.random()
Returns a random float in the range [a, b]
random.uniform(a, b)
Returns an integer in the range [0, b)
random.randrange(b)
Returns an integer in the range [a, b) in steps of c
random.randrange(a, b, c)
Returns an integer in the range [a, b]
random.randint(a, b)
Shuffles the elements of a list in place (returns None)
random.shuffle(luck)
print(luck)
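
The ranges listed above can be sketched as follows. The seed value and the sample list `luck` are arbitrary choices for the example.

```python
import random

random.seed(42)                   # fixed seed so repeated runs give the same values

x = random.random()               # 0.0 <= x < 1.0
u = random.uniform(2, 5)          # float between 2 and 5
r = random.randrange(10)          # integer from 0 to 9
s = random.randrange(1, 10, 3)    # one of 1, 4, 7
i = random.randint(1, 6)          # integer from 1 to 6, both ends included

luck = [1, 2, 3, 4, 5]
returned = random.shuffle(luck)   # shuffles in place and returns None
```

Because `shuffle()` returns `None`, printing its return value shows `None`; print the list itself after shuffling.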

Random Choices

Picks one item from a list uniformly at random
random.choice(items)
Equivalent, using a random index
items[random.randrange(len(items))]
Selects k items without repetition
random.sample(items, k=2)
Selects k items with repetition
random.choices(items, k=2)
Selects k items with repetition, weighted by the given weights
random.choices(items, weights=[10, 150, 20], k=2)
These methods also work on strings
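
A small sketch comparing the selection functions. The list `items` and the weights are invented for illustration.

```python
import random

items = ["red", "green", "blue", "gold"]   # hypothetical sample list

one = random.choice(items)                  # a single item
unique_pair = random.sample(items, k=2)     # two different items
pair = random.choices(items, k=2)           # two items, repeats possible

# One weight per item: "green" is drawn far more often than the others.
weighted = random.choices(items, weights=[10, 150, 20, 1], k=5)

letter = random.choice("python")            # strings work as sequences too
```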

Raw Strings

Raw strings (r-strings) treat backslashes as literal characters
r"A raw string"
A raw string made of only a single backslash, r"\", is not valid
A raw string ending in an odd number of backslashes is not valid
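
To make the rules above concrete, here is a brief sketch. The Windows-style path is just an example value.

```python
# A normal literal needs doubled backslashes; a raw literal does not.
normal = "C:\\temp\\new"
raw = r"C:\temp\new"
same = normal == raw          # True

# In a raw string, \n stays as two characters instead of one newline.
raw_len = len(r"\n")          # 2
newline_len = len("\n")       # 1

# A raw literal cannot end in a single backslash, so append it separately.
folder = r"C:\temp" + "\\"
```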

Regular Expressions - Patterns

Regular Expression Patterns (continued)

*
Match its preceding element zero or more times.
+
Match its preceding element one or more times.
?
Match its preceding element zero or one time.
{n}
Match its preceding element exactly n times.
{n,m}
Match its preceding element from n to m times (no space inside the braces).
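
The quantifiers above can be tried out with `re.fullmatch`, which succeeds only when the whole string fits the pattern. The helper name `matches` is hypothetical.

```python
import re

def matches(pattern, text):
    """Return True if the whole of `text` fits `pattern`."""
    return re.fullmatch(pattern, text) is not None

star = matches(r"ab*", "a")           # True: zero "b"s is allowed
plus = matches(r"ab+", "a")           # False: at least one "b" is required
optional = matches(r"ab?", "abb")     # False: at most one "b" is allowed
exact = matches(r"a{3}", "aaa")       # True: exactly three "a"s
ranged = matches(r"a{2,3}", "aaaa")   # False: four "a"s exceed the maximum
```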

re Methods

Matches the pattern only at the beginning of the string; returns a Match object or None
re.match(pattern, string)
Searches for the first occurrence of the pattern anywhere in the string
re.search(pattern, string)
Finds all occurrences of the pattern in the string and returns them as a list
re.findall(pattern, string)
Returns an iterator yielding Match objects for all matches
re.finditer(pattern, string)
Replaces matching substrings with a new string, for all occurrences or at most count of them
re.sub(pattern, replacement, string, count=0)
Splits the string at each match and returns the pieces as a list of strings
re.split(pattern, string)
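
A compact sketch of the methods above applied to one sample string. The text and pattern are made up for the example.

```python
import re

text = "cat bat rat"
pattern = r"[bcr]at"

at_start = re.match(pattern, text).group()     # "cat": found at position 0
not_at_start = re.match(r"bat", text)          # None: "bat" is not at the start
found_at = re.search(r"bat", text).start()     # 4
all_found = re.findall(pattern, text)          # ["cat", "bat", "rat"]
spans = [m.span() for m in re.finditer(pattern, text)]
replaced = re.sub(pattern, "X", text, count=2) # "X X rat"
pieces = re.split(r"\s+", text)                # split on runs of whitespace
```

The difference between `match` and `search` is the usual stumbling block: `match` only looks at the start of the string.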
 

Get requests using requests

import requests
url = "https://www.wikipedia.org/"
r = requests.get(url)   
text = r.text

Web Scraping

from bs4 import BeautifulSoup

# Parse HTML stored as a string
# Parser options: 'html5lib', 'html.parser', or 'lxml'
soup = BeautifulSoup(html, 'html5lib')

# Returns formatted html
soup.prettify()

# Find the first instance of an HTML tag
soup.find(tag, attrs={"class":"__"})

# Find all instances of an HTML tag
soup.find_all(tag, attrs={"class":"__"})

Get requests using urllib

from urllib.request import urlopen, Request
url = "https://www.wikipedia.org/"
request = Request(url)
response = urlopen(request)
html = response.read()
response.close()
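
`response.read()` returns bytes, so the result usually needs decoding before it can be parsed as text. The sketch below shows this with a `with` block that closes the response automatically. As an assumption for the example, a `data:` URL stands in for a real web address so the code runs without network access.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: a data: URL replaces a real site so the sketch works offline.
url = "data:text/html," + quote("<h1>Hello</h1>")

request = Request(url)
with urlopen(request) as response:   # the with block closes the response
    html = response.read()           # bytes, not str

text = html.decode("utf-8")          # decode before treating it as text
```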

Higher Order Functions

# min/max

max(iterable[, default=obj, key=func])
min(iterable[, default=obj, key=func])

a = min([12,"apple",223,"A","B"],key= lambda c: len(str(c)))
# Output: "A"   (minimum value in the list based on len(str) of the list object)


students = [{"name":"Saint", "age":"25"},
{"name":"Watson", "age":"35"},
{"name":"Karlson", "age":"21"},
{"name":"Kenzo", "age":"15"}]

youngest = min(students, key=lambda x: int(x["age"]))  # Output: {"name":"Kenzo", "age":"15"}
oldest = max(students, key=lambda x: int(x["age"]))    # Output: {"name":"Watson", "age":"35"}



#sorted

sorted(iterable, key=func, reverse=reverse)

L =["apple", "ban", "dog", "aeroplane"]
print(sorted(L,key=len))  # Sort based on length of string
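
One thing worth noting about the students example: the ages are stored as strings, so comparing them directly orders them as text. The sketch below uses a slightly changed, hypothetical data set (one age is "9") to show where a text comparison and a numeric comparison differ.

```python
# Hypothetical variant of the students list with a single-digit age.
students = [{"name": "Saint", "age": "25"},
            {"name": "Watson", "age": "35"},
            {"name": "Karlson", "age": "21"},
            {"name": "Kenzo", "age": "9"}]

# Text comparison: "21" sorts before "9" because "2" < "9".
by_text = min(students, key=lambda s: s["age"])

# Numeric comparison: convert inside the key function.
by_number = min(students, key=lambda s: int(s["age"]))

# sorted() accepts the same key, plus reverse=True for descending order.
oldest_first = [s["name"] for s in
                sorted(students, key=lambda s: int(s["age"]), reverse=True)]
```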


#map 

map(function, iterable, ...)

map applies a function to each item of the iterable and returns a map object (a lazy iterator).

Wrap the map object in list() to print its values.

# A function to return the square of n
def square(n):
   return n**2

# Some iterable (avoid naming it `list`, which would shadow the built-in)
numbers = [1,2,3,4]

# Map the function over the iterable and convert the map object to a list

print(list(map(square, numbers)))  # Output: [1, 4, 9, 16]


# Filter

filter(function, iterable)

filter runs through each element of the iterable and applies function to it.
It keeps only the elements for which function returns a true value.


seq = [0, 1, 2, 3, 5, 8, 13]

# result keeps only the odd numbers
result = filter(lambda x: x % 2 != 0, seq)
print(list(result)) # Output: [1, 3, 5, 13]
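
As a small follow-up sketch, map and filter can be chained, and both return lazy iterators that can be consumed only once. The variable names here are illustrative.

```python
numbers = [0, 1, 2, 3, 5, 8, 13]

# Keep the odd numbers, then square them.
odds = filter(lambda x: x % 2 != 0, numbers)
squares_of_odds = list(map(lambda x: x ** 2, odds))

# The same result written as a list comprehension.
same_result = [x ** 2 for x in numbers if x % 2 != 0]

# map() can take several iterables; the function receives one item from each.
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))

# Iterators are used up after one pass.
lazy = map(str, [1, 2])
first_pass = list(lazy)    # ["1", "2"]
second_pass = list(lazy)   # [] because nothing is left
```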
           
 
