
Files and Internet Scraping Python Cheat Sheet by leenmajz

This is a cheat sheet for programming in Python.


Code point higher than 127
"\u0915" or "क"

File Modes

'r', 'r+', 'w', 'w+': position at start
'a', 'a+': position at end
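The start and end positions can be checked with tell() right after opening a file. This is a minimal sketch; the scratch file name is an assumption made for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:   # 'w' creates or truncates the file
    start_w = f.tell()       # cursor starts at offset 0
    f.write("hello")

with open(path, "r") as f:   # 'r' reads from the start
    start_r = f.tell()

with open(path, "a") as f:   # 'a' places the cursor at the end
    start_a = f.tell()

print(start_w, start_r, start_a)  # 0 0 5
```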

Reading a File

Opening and closing a file
file = open(file_name, 'r', encoding='utf-8')
file.close()
with open(file_name, "r") as file:
Reading a file
Reads entire file as a string
    content = file.read()
Reads the first 10 characters
    content = file.read(10)
Stores contents as a list of lines
    lines = file.readlines()
Reads the next line as a string
    line = file.readline()
Iterating through lines
for line in lines:
Strip to remove the newline character
    line = line.strip()
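The reading calls above can be tried on a small scratch file. This is a sketch under the assumption of a three-line UTF-8 text file; the file name and its contents are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file with three lines.
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\nthird line\n")

with open(path, "r", encoding="utf-8") as f:
    whole = f.read()              # entire file as one string

with open(path, "r", encoding="utf-8") as f:
    head = f.read(10)             # first 10 characters
    rest_of_line = f.readline()   # what is left of line 1 (just the newline)
    next_line = f.readline()      # the whole second line

with open(path, "r", encoding="utf-8") as f:
    lines = f.readlines()         # list of lines, newlines included

cleaned = [line.strip() for line in lines]
print(head)     # first line
print(cleaned)  # ['first line', 'second line', 'third line']
```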
Writing to a text file
with open(file_name, 'w') as file:
Writing one line
    file.write("Hello, World!\n")
Writing a list of lines
    file.writelines(lines)
Appending to a text file
with open(file_name, 'a') as file:
    file.write("\nAppending a new line.")
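A short round trip shows how 'w' and 'a' interact. This is a sketch with a hypothetical scratch file; note that writelines() does not add newlines on its own.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

with open(path, "w") as f:
    f.write("Hello, World!\n")
    # writelines() writes each string as-is, so the newlines are explicit
    f.writelines(["line 2\n", "line 3\n"])

with open(path, "a") as f:        # 'a' keeps the existing content
    f.write("Appending a new line.\n")

with open(path) as f:
    result = f.read().splitlines()

print(result)
# ['Hello, World!', 'line 2', 'line 3', 'Appending a new line.']
```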
Checking file existence
import os
os.path.exists(file_name)
Handling file exceptions
try:
    with open(file) as f:
        content = f.read()
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"An error occurred: {e}")
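The existence check and the exception handler can be exercised together on a path that is known not to exist. This is a sketch; the missing file name is an assumption for the example.

```python
import os
import tempfile

# A path inside a fresh temporary directory, so the file cannot exist yet.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

exists = os.path.exists(missing)   # False

try:
    with open(missing) as f:
        content = f.read()
    message = "File read"
except FileNotFoundError:
    message = "File not found"
except Exception as e:             # any other I/O problem
    message = f"An error occurred: {e}"

print(exists, message)  # False File not found
```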
Telling/seeking in a file
with open(file) as f:
Gives current position
    f.tell()
Move the cursor to the beginning
    f.seek(0)
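A small sketch of tell() and seek(). It assumes binary mode so that positions are plain byte offsets; in text mode tell() returns an opaque number that is only meant to be passed back to seek(). The scratch file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
with open(path, "wb") as f:
    f.write(b"abcdef")

with open(path, "rb") as f:
    before = f.tell()   # 0: cursor at the start
    chunk = f.read(3)   # b"abc"
    after = f.tell()    # 3: cursor moved past the bytes read
    f.seek(0)           # back to the beginning
    again = f.read(3)   # b"abc" once more

print(before, after, chunk == again)  # 0 3 True
```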

csv Library

import csv
Reading CSV file
Returns a reader object; each row is a list of strings
content = csv.reader(file, delimiter=";")
data = list(content)
Writing to CSV file
writer = csv.writer(file)
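A write-then-read round trip with a semicolon delimiter might look like the sketch below. The file name and the sample rows are assumptions; opening with newline="" follows the csv module's recommendation.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and sample rows.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
rows = [["name", "age"], ["Saint", "25"], ["Watson", "35"]]

with open(path, "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file, delimiter=";")
    writer.writerows(rows)        # one call writes every row

with open(path, newline="", encoding="utf-8") as file:
    content = csv.reader(file, delimiter=";")
    data = list(content)          # every value comes back as a string

print(data)
# [['name', 'age'], ['Saint', '25'], ['Watson', '35']]
```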

Loading JSONs

import json
Loading a JSON file (json_file is an open file object)
json_data = json.load(json_file)
Loading JSON from a string
json_data = json.loads(json_string)
Change JSON to a string
string = json.dumps(data, indent=2)
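The difference between the string functions (loads, dumps) and the file functions (load, dump) is easiest to see side by side. This is a sketch; the sample record and file name are made up.

```python
import json
import os
import tempfile

text = '{"name": "Saint", "age": 25}'   # hypothetical sample record

data = json.loads(text)                 # string -> dict
string = json.dumps(data, indent=2)     # dict -> pretty-printed string

# File round trip: dump() and load() take an open file object.
path = os.path.join(tempfile.mkdtemp(), "demo.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f)
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(data["age"])      # 25
print(loaded == data)   # True
```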

Random library

import random
Returns a random float in the range [0.0, 1.0)
random.random()
Returns a random float in the range [a, b]
random.uniform(a, b)
Returns an integer in the range [0, b)
random.randrange(b)
Returns an integer in the range [a, b) in steps of c
random.randrange(a, b, c)
Returns an integer in the range [a, b]
random.randint(a, b)
Randomly changes the position of elements in place (returns None)
random.shuffle(luck)
print(luck)
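The ranges listed above can be checked with a quick sketch. The seed and the sample values are assumptions, used only so that reruns behave the same way.

```python
import random

random.seed(0)  # assumption: fixed seed, only to make reruns repeatable

x = random.random()             # float with 0.0 <= x < 1.0
y = random.uniform(2, 5)        # float with 2 <= y <= 5
i = random.randrange(10)        # integer from 0 to 9
j = random.randrange(0, 10, 2)  # one of 0, 2, 4, 6, 8
k = random.randint(1, 6)        # integer from 1 to 6, both ends included

luck = [1, 2, 3, 4, 5]
result = random.shuffle(luck)   # shuffles in place and returns None
print(result)                   # None
print(sorted(luck))             # [1, 2, 3, 4, 5]: same items, new order in luck
```

Because shuffle() returns None, print the list itself rather than the return value.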

Random Choices

Uniformly picks one random item from a list
random.choice(list)
Selects k items without repetition
random.sample(list, k=2)
Selects k items with repetition
random.choices(list, k=2)
Selects k items with repetition, using weights instead of a uniform distribution
random.choices(list, weights=[10,150,20], k=2)
These methods also work on strings
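A sketch of the selection functions on a small list and on a string. The item names are made up for the example, and no particular result is shown because the picks are random.

```python
import random

items = ["red", "green", "blue"]   # hypothetical sample list

one = random.choice(items)                   # a single item
two_distinct = random.sample(items, k=2)     # two different positions
two_any = random.choices(items, k=2)         # may pick the same item twice
weighted = random.choices(items, weights=[10, 150, 20], k=2)  # "green" is most likely

letter = random.choice("python")             # strings are sequences too
```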

Raw Strings

r"A raw string"
A raw string containing only a single backslash, r"\", is not valid
A raw string ending in an odd number of backslashes is not valid
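A brief sketch of what the r prefix changes, and one common way around the trailing-backslash restriction. The Windows-style path is only an illustrative value.

```python
# In a raw string a backslash is kept literally instead of starting an escape.
path = r"C:\new\table"
same = "C:\\new\\table"
print(path == same)   # True

print(len("\n"))      # 1: a single newline character
print(len(r"\n"))     # 2: a backslash followed by the letter n

# A raw string cannot end in a single backslash, so append it separately.
folder = r"C:\dir" + "\\"
print(folder)         # C:\dir\
```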

Regular Expressions - Patterns

*
Match its preceding element zero or more times.
+
Match its preceding element one or more times.
?
Match its preceding element zero or one time.
{n}
Match its preceding element exactly n times.
{n,m}
Match its preceding element from n to m times.
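Each quantifier can be checked with re.fullmatch, which succeeds only when the whole string matches. This is a sketch; the sample patterns and strings are made up.

```python
import re

star = re.fullmatch(r"ab*", "a")             # zero b's is fine for *
plus = re.fullmatch(r"ab+", "a")             # + needs at least one b
optional = re.fullmatch(r"colou?r", "color") # the u may be absent
exact = re.fullmatch(r"\d{3}", "123")        # exactly three digits
ranged = re.findall(r"\d{2,3}", "1 22 333 4444")  # runs of two or three digits

print(star is not None, plus is None)  # True True
print(ranged)                          # ['22', '333', '444']
```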

re Methods

Checks for a match only at the beginning of the string and returns a Match object (or None)
re.match(pattern, string)
Searches for the first occurrence of the pattern anywhere in the string
re.search(pattern, string)
Finds all occurrences of the pattern in the string and returns a list
re.findall(pattern, string)
Returns an iterator yielding Match objects for all matches
re.finditer(pattern, string)
Replaces matching substrings with a new string, for all occurrences or for a specified number (count)
re.sub(pattern, replacement, string)
Splits the string wherever the pattern matches and returns a list of the pieces
re.split(pattern, string)
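The methods above behave as follows on one sample string. This is a sketch; the text and patterns are made up for the example.

```python
import re

text = "cat bat rat"   # hypothetical sample text

at_start = re.match(r"bat", text)     # None: "bat" is not at the beginning
anywhere = re.search(r"bat", text)    # Match object covering positions 4 to 7
words = re.findall(r"[a-z]at", text)  # every match as a list of strings
starts = [m.start() for m in re.finditer(r"[a-z]at", text)]
replaced = re.sub(r"[a-z]at", "dog", text, count=2)  # only the first two matches
parts = re.split(r"\s+", text)        # split on runs of whitespace

print(at_start)         # None
print(anywhere.span())  # (4, 7)
print(words)            # ['cat', 'bat', 'rat']
print(starts)           # [0, 4, 8]
print(replaced)         # dog dog rat
print(parts)            # ['cat', 'bat', 'rat']
```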

Get requests using requests

import requests
url = ""
r = requests.get(url)   
text = r.text


from bs4 import BeautifulSoup

# Parse HTML stored as a string
# 'html5lib' 'html.parser' or 'lxml'
soup = BeautifulSoup(html, 'html5lib')

# Returns formatted html
soup.prettify()

# Find the first instance of an HTML tag
soup.find(tag, attrs={"class":"__"})

# Find all instances of an HTML tag
soup.find_all(tag, attrs={"class":"__"})

Get requests using urllib

from urllib.request import urlopen, Request
url = ""
request = Request(url)
response = urlopen(request)
html = response.read()

Higher Order Functions

# min/max

max(iterable[, default=obj, key=func])
min(iterable[, default=obj, key=func])

a = min([12,"apple",223,"A","B"],key= lambda c: len(str(c)))
# Output: "A"   (minimum value in the list based on len(str) of the list object)

students = [{"name":"Saint", "age":"25"},
{"name":"Watson", "age":"35"},
{"name":"Karlson", "age":"21"},
{"name":"Kenzo", "age":"15"}]

youngest = min(students, key=lambda x: x["age"])  # Output: {"name":"Kenzo", "age":"15"}
oldest = max(students, key=lambda x: x["age"])    # Output: {"name":"Watson", "age":"35"}
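One caveat with the example above: the ages are stored as strings, so they are compared character by character. That happens to work for two-digit ages but not in general. A minimal sketch, using a hypothetical student "Ann" who is not in the original list:

```python
students = [{"name": "Saint", "age": "25"},
            {"name": "Ann", "age": "9"}]   # hypothetical extra student

# String comparison: "25" < "9" because "2" sorts before "9".
by_text = min(students, key=lambda s: s["age"])

# Converting to int compares the ages numerically.
by_number = min(students, key=lambda s: int(s["age"]))

# default is returned when the iterable is empty, instead of raising ValueError.
nothing = max([], default=None)

print(by_text["name"])    # Saint
print(by_number["name"])  # Ann
print(nothing)            # None
```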


sorted(iterable, key=func, reverse=reverse)

L =["apple", "ban", "dog", "aeroplane"]
print(sorted(L, key=len))  # Sort based on length of string
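For reference, a sketch of what the call produces and how reverse changes it. sorted() is stable, so items of equal length keep their original relative order in both directions.

```python
L = ["apple", "ban", "dog", "aeroplane"]

ascending = sorted(L, key=len)                 # shortest first
descending = sorted(L, key=len, reverse=True)  # longest first

print(ascending)   # ['ban', 'dog', 'apple', 'aeroplane']
print(descending)  # ['aeroplane', 'apple', 'ban', 'dog']
print(L)           # unchanged: sorted() returns a new list
```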


map(function, iterable, ...)

Applies a function to each item of the iterable and returns a map object (an iterator).

Use print(list(map(...))) to print the values!

# A function to return the square of n
def square(n):
   return n**2

# Some iterable (not named "list", which would shadow the built-in list)
numbers = [1,2,3,4]

# Map the function over the iterable and apply list to the map object

print(list(map(square, numbers)))  # Output: [1, 4, 9, 16]
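The trailing `...` in the signature means map accepts several iterables, passing one item from each to the function. A short sketch with made-up values, which also shows that a map object can be consumed only once:

```python
# With two iterables the function receives one argument from each.
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))

# Built-in functions work as well.
lengths = list(map(len, ["a", "bb", "ccc"]))

# A map object is an iterator, so a second pass yields nothing.
squares = map(lambda n: n ** 2, [1, 2, 3])
first_pass = list(squares)
second_pass = list(squares)

print(sums)         # [11, 22, 33]
print(lengths)      # [1, 2, 3]
print(first_pass)   # [1, 4, 9]
print(second_pass)  # []
```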

# Filter

filter(function, iterable)

filter runs through each element of iterable and applies function to it.
It keeps only the elements for which function returns a true value.

seq = [0, 1, 2, 3, 5, 8, 13]
# result keeps only the odd numbers
result = filter(lambda x: x % 2 != 0, seq)
print(list(result)) # Output: [1,3,5,13]
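Two related forms are worth knowing, sketched here on the same sequence: passing None as the function keeps the truthy items, and a list comprehension gives the same result as the lambda version.

```python
seq = [0, 1, 2, 3, 5, 8, 13]

odds = list(filter(lambda x: x % 2 != 0, seq))
truthy = list(filter(None, seq))             # None drops falsy items such as 0
comprehension = [x for x in seq if x % 2 != 0]

print(odds)                   # [1, 3, 5, 13]
print(truthy)                 # [1, 2, 3, 5, 8, 13]
print(comprehension == odds)  # True
```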

