Cheatography
https://cheatography.com
Importing Text Files I
file = open(file_name, 'r')
|
open the file |
file.read()
|
read the file |
file.close()
|
close the file |
file.closed
|
check if the file is closed |
It is good practice to close the file after reading it when it was opened with a plain open() call
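The four steps above can be sketched end to end as follows. This is a minimal sketch: the temporary file and its contents are assumptions made only so the example runs on its own.

```python
import os
import tempfile

# Create a small throwaway file (hypothetical contents) so the example is self-contained.
fd, file_name = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("first line\nsecond line\n")

file = open(file_name, "r")  # open the file
text = file.read()           # read the whole file into one string
file.close()                 # close the file
was_closed = file.closed     # True once close() has been called

os.remove(file_name)         # clean up the throwaway file
print(text)
print(was_closed)
```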
Importing Text Files II
with open(file_name) as file:
|
open the file |
file.read()
|
read the file |
file.readline() |
read line by line |
When using the 'with' statement there is no need to close the file; it is closed automatically when the block ends
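A short sketch of the 'with' pattern, reading line by line and then checking that the file was closed on exit. The temporary file and its contents are assumptions for illustration.

```python
import os
import tempfile

# Throwaway two-line file (hypothetical contents).
fd, file_name = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("first line\nsecond line\n")

with open(file_name) as file:
    first = file.readline()   # each call returns the next line, newline included
    second = file.readline()
    rest = file.read()        # nothing is left, so this is an empty string

closed_after_block = file.closed  # the 'with' block closed the file for us

os.remove(file_name)
print(first, second, closed_after_block)
```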
Importing Flat Files with Numpy I
import numpy as np
|
import numpy |
np.loadtxt(file_name, delimiter=' ')
|
importing the file |
skiprows=n
|
argument to skip the first n rows |
usecols=[i, j]
|
argument to import only specific columns |
dtype=str |
argument to import the data as strings |
By default loadtxt expects purely numeric data; use dtype=str to read everything as strings
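The loadtxt arguments listed above might be combined as in the sketch below. The in-memory text stands in for a real file and is made up for this example; loadtxt accepts file-like objects as well as file names.

```python
from io import StringIO

import numpy as np

# Hypothetical space-delimited contents with one header row.
raw = "a b c\n1 2 3\n4 5 6\n"

# Skip the header row and keep only the first and third columns.
data = np.loadtxt(StringIO(raw), delimiter=" ", skiprows=1, usecols=[0, 2])

# Read everything, header included, as strings.
as_text = np.loadtxt(StringIO(raw), delimiter=" ", dtype=str)

print(data)
print(as_text)
```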
Importing Flat Files with Numpy II
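import numpy as np
|
import numpy |
np.recfromcsv(file, delimiter=",", names=True, dtype=None)
|
import the file as a record array |
np.genfromtxt(file, delimiter=',', names=True, dtype=None)
|
import the file as a structured array |
With recfromcsv() and genfromtxt() we can import data with mixed types. NumPy 2.0 and later no longer expose recfromcsv() in the main namespace, so genfromtxt() is the safer choice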
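As an illustration of mixed-type import, here is a minimal genfromtxt sketch. The CSV contents and column names are invented for the example, and encoding="utf-8" is an added assumption so that text columns come back as ordinary strings.

```python
from io import StringIO

import numpy as np

# Hypothetical CSV with a text column and a numeric column.
raw = "name,age\nAnn,31\nBob,27\n"

# names=True takes field names from the first row;
# dtype=None lets NumPy infer a type per column.
data = np.genfromtxt(StringIO(raw), delimiter=",", names=True,
                     dtype=None, encoding="utf-8")

print(data.dtype.names)  # field names taken from the header
print(data["age"])       # columns are accessed by field name
```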
Importing Stata Files
import pandas as pd
|
importing pandas |
df = pd.read_stata('disarea.dta')
|
reading the stata file |
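Since 'disarea.dta' is not available here, the sketch below first writes a tiny Stata file with DataFrame.to_stata() and then reads it back with read_stata(). The column names and values are hypothetical.

```python
import os
import shutil
import tempfile

import pandas as pd

# Hypothetical data standing in for a real .dta file.
original = pd.DataFrame({"year": [1990, 1991], "disa1": [0.5, 1.25]})

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.dta")
original.to_stata(path, write_index=False)  # write a small Stata file

df = pd.read_stata(path)                    # read it back as a dataframe

shutil.rmtree(tmp_dir)                      # clean up the temporary directory
print(df)
```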
Importing Flat Files With Pandas
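import pandas as pd
|
import pandas |
pd.read_csv(file_name)
|
open csv file |
nrows=n
|
argument for the number of rows to load |
header=None
|
argument for no header |
sep='\t'
|
argument to set the delimiter |
comment='#'
|
argument giving the character after which the rest of a line is treated as a comment |
na_values='Nothing'
|
argument to recognize a string as a NaN value |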
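The read_csv arguments above might be used together as in this sketch. The tab-separated text, the '#' comment marker and the 'Nothing' placeholder are assumptions chosen for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical tab-separated contents: no header, a trailing comment,
# and the string "Nothing" marking a missing value.
raw = "1\tNothing\t3#first row\n4\t5\t6\n7\t8\t9\n"

df = pd.read_csv(
    StringIO(raw),
    sep="\t",             # delimiter
    header=None,          # the file has no header row
    nrows=2,              # load only the first two rows
    comment="#",          # ignore everything after '#'
    na_values="Nothing",  # treat this string as NaN
)

print(df)
```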
Importing Pickled Files
import pickle
|
import the library |
with open(file_name, 'rb') as file :
|
open file |
data = pickle.load(file)
|
read file |
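A minimal round trip, assuming a small dictionary as the pickled object: it is first written with pickle.dump() so that there is a file to load.

```python
import os
import pickle
import tempfile

# Hypothetical object to pickle.
record = {"city": "Oslo", "values": [1, 2, 3]}

fd, file_name = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(fd, "wb") as out:
    pickle.dump(record, out)      # write the pickle file

with open(file_name, "rb") as file:  # pickle files are opened in binary mode
    data = pickle.load(file)         # read the object back

os.remove(file_name)
print(data)
```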
Importing Spreadsheet Files
import pandas as pd
|
importing pandas |
xl = pd.ExcelFile(file_name)
|
opening the file |
xl.sheet_names
|
listing the sheet names |
xl.parse(sheet_name/index)
|
loading a sheet to a dataframe |
skiprows=[index]
|
skipping a specific row |
names=[names]
|
naming the sheet's columns |
usecols=[index]
|
parse specific columns |
skiprows, names and usecols are all arguments of the function parse()
Importing SAS Files
from sas7bdat import SAS7BDAT
|
importing sas7bdat library |
import pandas as pd
|
importing pandas |
with SAS7BDAT('file_name') as file:
|
opening the file |
df = file.to_data_frame()
|
loading the file as dataframe |
Importing HDF5 files
import numpy as np
|
import numpy |
import h5py
|
importing the h5py library |
data = h5py.File(file_name, 'r')
|
reading the file |
Importing MATLAB files
import scipy.io
|
importing scipy.io |
scipy.io.loadmat('file_name')
|
reading the file (returns a dictionary of variables) |
Relational databases I
import pandas as pd
|
importing pandas |
from sqlalchemy import create_engine
|
importing the necessary library |
engine = create_engine('databasetype:///name.databasetype')
|
creating an engine |
con = engine.connect()
|
connecting to the engine |
rs = con.execute('SELECT * FROM Album')
|
perform the query |
df = pd.DataFrame(rs.fetchall())
|
save as a dataframe |
df.columns = rs.keys()
|
set column names |
con.close()
|
close the connection |
The best practice is to close the connection once the results have been fetched. SQLAlchemy 2.0 and later expect raw SQL strings to be wrapped in text()
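The query, fetch, name-columns, close sequence can be tried without a database server. The sketch below swaps in the standard-library sqlite3 module for SQLAlchemy, so the calls differ slightly (column names come from the cursor's description rather than rs.keys()). The Album table contents are made up.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a small, hypothetical Album table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Album (AlbumId INTEGER, Title TEXT)")
con.executemany("INSERT INTO Album VALUES (?, ?)",
                [(1, "First"), (2, "Second")])

rs = con.execute("SELECT * FROM Album")  # perform the query
df = pd.DataFrame(rs.fetchall())         # save all rows as a dataframe

# With the DB-API, column names are the first item of each description entry.
df.columns = [col[0] for col in rs.description]

con.close()                              # close the connection
print(df)
```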
Relational databases II
engine = create_engine('databasetype:///name.databasetype')
|
creating an engine |
with engine.connect() as con:
|
connecting to the engine |
rs = con.execute('sql code')
|
perform the query |
df = pd.DataFrame(rs.fetchmany(size=3))
|
load a number of rows as a dataframe |
With the 'with' statement you don't have to close the connection at the end
Relational databases III
engine = create_engine('databasetype:///name.databasetype')
|
creating an engine |
df = pd.read_sql_query('SQL code', engine)
|
perform the query and load the result as a dataframe |
Shortest way to connect to a database and perform a query
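pd.read_sql_query() also accepts a standard-library sqlite3 connection, which makes for a self-contained sketch. This uses sqlite3 in place of a SQLAlchemy engine, and the Album rows are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a small, hypothetical Album table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Album (AlbumId INTEGER, Title TEXT)")
con.executemany("INSERT INTO Album VALUES (?, ?)",
                [(1, "First"), (2, "Second")])

# One call runs the query and returns a dataframe with column names set.
df = pd.read_sql_query("SELECT * FROM Album", con)

con.close()
print(df)
```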