由 neevop 十二月 10, 2022
JSON
Text file format for storing collections of strings and numbers.
import json
<str> = json.dumps(<object>) # Converts object to JSON string.
<object> = json.loads(<str>) # Converts JSON string to object.
Read Object from JSON File
def read_json_file(filename):
with open(filename, encoding='utf-8') as file:
return json.load(file)
Write Object to JSON File
def write_to_json_file(filename, an_object):
with open(filename, 'w', encoding='utf-8') as file:
json.dump(an_object, file, ensure_ascii=False, indent=2)
Pickle
Binary file format for storing Python objects.
import pickle
<bytes> = pickle.dumps(<object>) # Converts object to bytes object.
<object> = pickle.loads(<bytes>) # Converts bytes object to object.
Read Object from File
def read_pickle_file(filename):
with open(filename, 'rb') as file:
return pickle.load(file)
Write Object to File
def write_to_pickle_file(filename, an_object):
with open(filename, 'wb') as file:
pickle.dump(an_object, file)
CSV
Text file format for storing spreadsheets.
import csv
Read
<reader> = csv.reader(<file>) # Also: `dialect='excel', delimiter=','`.
<list> = next(<reader>) # Returns next row as a list of strings.
<list> = list(<reader>) # Returns a list of remaining rows.
- File must be opened with a
'newline=""'
argument, or newlines embedded inside quoted fields will not be interpreted correctly! - To print the spreadsheet to the console use Tabulate library.
- For XML and binary Excel files (xlsx, xlsm and xlsb) use Pandas library.
Write
<writer> = csv.writer(<file>) # Also: `dialect='excel', delimiter=','`.
<writer>.writerow(<collection>) # Encodes objects using `str(<el>)`.
<writer>.writerows(<coll_of_coll>) # Appends multiple rows.
- File must be opened with a
'newline=""'
argument, or ‘\r’ will be added in front of every ‘\n’ on platforms that use ‘\r\n’ line endings!
Parameters
'dialect'
- Master parameter that sets the default values. String or a Dialect object.'delimiter'
- A one-character string used to separate fields.'quotechar'
- Character for quoting fields that contain special characters.'doublequote'
- Whether quotechars inside fields are/get doubled or escaped.'skipinitialspace'
- Is space character at the start of the field stripped by the reader.'lineterminator'
- How writer terminates rows. Reader is hardcoded to ‘\n’, ‘\r’, ‘\r\n’.'quoting'
- 0: As necessary, 1: All, 2: All but numbers which are read as floats, 3: None.'escapechar'
- Character for escaping quotechars if doublequote is False.
Dialects
+------------------+--------------+--------------+--------------+
| | excel | excel-tab | unix |
+------------------+--------------+--------------+--------------+
| delimiter | ',' | '\t' | ',' |
| quotechar | '"' | '"' | '"' |
| doublequote | True | True | True |
| skipinitialspace | False | False | False |
| lineterminator | '\r\n' | '\r\n' | '\n' |
| quoting | 0 | 0 | 1 |
| escapechar | None | None | None |
+------------------+--------------+--------------+--------------+
Read Rows from CSV File
def read_csv_file(filename, dialect='excel'):
with open(filename, encoding='utf-8', newline='') as file:
return list(csv.reader(file, dialect))
Write Rows to CSV File
def write_to_csv_file(filename, rows, dialect='excel'):
with open(filename, 'w', encoding='utf-8', newline='') as file:
writer = csv.writer(file, dialect)
writer.writerows(rows)
SQLite
A server-less database engine that stores each database into a separate file.
import sqlite3
<conn> = sqlite3.connect(<path>) # Opens existing or new file. Also ':memory:'.
<conn>.close() # Closes the connection.
Read
<cursor> = <conn>.execute('<query>') # Can raise a subclass of sqlite3.Error.
<tuple> = <cursor>.fetchone() # Returns next row. Also next(<cursor>).
<list> = <cursor>.fetchall() # Returns remaining rows. Also list(<cursor>).
Write
<conn>.execute('<query>') # Can raise a subclass of sqlite3.Error.
<conn>.commit() # Saves all changes since the last commit.
<conn>.rollback() # Discards all changes since the last commit.
Or:
with <conn>: # Exits the block with commit() or rollback(),
<conn>.execute('<query>') # depending on whether any exception occurred.
Placeholders
<conn>.execute('<query>', <list/tuple>) # Replaces '?'s in query with values.
<conn>.execute('<query>', <dict/namedtuple>) # Replaces ':<key>'s with values.
<conn>.executemany('<query>', <coll_of_above>) # Runs execute() multiple times.
- Passed values can be of type str, int, float, bytes, None, bool, datetime.date or datetime.datetime.
- Bools will be stored and returned as ints and dates as ISO formatted strings.
Example
Values are not actually saved in this example because 'conn.commit()'
is omitted!
>>> conn = sqlite3.connect('test.db')
>>> conn.execute('CREATE TABLE person (person_id INTEGER PRIMARY KEY, name, height)')
>>> conn.execute('INSERT INTO person VALUES (NULL, ?, ?)', ('Jean-Luc', 187)).lastrowid
1
>>> conn.execute('SELECT * FROM person').fetchall()
[(1, 'Jean-Luc', 187)]
SqlAlchemy
# $ pip3 install sqlalchemy
from sqlalchemy import create_engine, text
<engine> = create_engine('<url>').connect() # Url: 'dialect://user:password@host/dbname'.
<conn> = <engine>.connect() # Creates a connection. Also <conn>.close().
<cursor> = <conn>.execute(text('<query>'), …) # Replaces ':<key>'s with keyword arguments.
with <conn>.begin(): ... # Exits the block with commit or rollback.
+------------+--------------+-----------+-----------------------------------+
| Dialects | pip3 install | import | Dependencies |
+------------+--------------+-----------+-----------------------------------+
| mysql | mysqlclient | MySQLdb | www.pypi.org/project/mysqlclient |
| postgresql | psycopg2 | psycopg2 | www.psycopg.org/docs/install.html |
| mssql | pyodbc | pyodbc | apt install g++ unixodbc-dev |
| oracle | cx_oracle | cx_Oracle | Oracle Instant Client |
+------------+--------------+-----------+-----------------------------------+
Bytes
Bytes object is an immutable sequence of single bytes. Mutable version is called bytearray.
<bytes> = b'<str>' # Only accepts ASCII characters and \x00-\xff.
<int> = <bytes>[<index>] # Returns an int in range from 0 to 255.
<bytes> = <bytes>[<slice>] # Returns bytes even if it has only one element.
<bytes> = <bytes>.join(<coll_of_bytes>) # Joins elements using bytes as a separator.
Encode
<bytes> = bytes(<coll_of_ints>) # Ints must be in range from 0 to 255.
<bytes> = bytes(<str>, 'utf-8') # Or: <str>.encode('utf-8')
<bytes> = <int>.to_bytes(n_bytes, …) # `byteorder='little/big', signed=False`.
<bytes> = bytes.fromhex('<hex>') # Hex pairs can be separated by whitespaces.
Decode
<list> = list(<bytes>) # Returns ints in range from 0 to 255.
<str> = str(<bytes>, 'utf-8') # Or: <bytes>.decode('utf-8')
<int> = int.from_bytes(<bytes>, …) # `byteorder='little/big', signed=False`.
'<hex>' = <bytes>.hex() # Returns hex pairs. Accepts `sep=<str>`.
Read Bytes from File
def read_bytes(filename):
with open(filename, 'rb') as file:
return file.read()
Write Bytes to File
def write_bytes(filename, bytes_obj):
with open(filename, 'wb') as file:
file.write(bytes_obj)
Struct
- Module that performs conversions between a sequence of numbers and a bytes object.
- System’s type sizes, byte order, and alignment rules are used by default.
from struct import pack, unpack
<bytes> = pack('<format>', <el_1> [, ...]) # Packages arguments into bytes object.
<tuple> = unpack('<format>', <bytes>) # Use iter_unpack() for iterator of tuples.
>>> pack('>hhl', 1, 2, 3)
b'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('>hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
Format
For standard type sizes and manual alignment (padding) start format string with:
'='
- System’s byte order (usually little-endian).'<'
- Little-endian.'>'
- Big-endian (also'!'
).
Besides numbers, pack() and unpack() also support bytes objects as part of the sequence:
'c'
- A bytes object with a single element. For pad byte use'x'
.'<n>s'
- A bytes object with n elements.
Integer types. Use a capital letter for unsigned type. Minimum and standard sizes are in brackets:
'b'
- char (1/1)'h'
- short (2/2)'i'
- int (2/4)'l'
- long (4/4)'q'
- long long (8/8)
Floating point types:
'f'
- float (4/4)'d'
- double (8/8)
Array
List that can only hold numbers of a predefined type. Available types and their minimum sizes in bytes are listed above. Sizes and byte order are always determined by the system.
from array import array
<array> = array('<typecode>', <collection>) # Array from collection of numbers.
<array> = array('<typecode>', <bytes>) # Array from bytes object.
<array> = array('<typecode>', <array>) # Treats array as a sequence of numbers.
<bytes> = bytes(<array>) # Or: <array>.tobytes()
<file>.write(<array>) # Writes array to the binary file.
Memory View
- A sequence object that points to the memory of another object.
- Each element can reference a single or multiple consecutive bytes, depending on format.
- Order and number of elements can be changed with slicing.
- Casting only works between char and other types and uses system’s sizes.
- Byte order is always determined by the system.
<mview> = memoryview(<bytes/bytearray/array>) # Immutable if bytes, else mutable.
<real> = <mview>[<index>] # Returns an int or a float.
<mview> = <mview>[<slice>] # Mview with rearranged elements.
<mview> = <mview>.cast('<typecode>') # Casts memoryview to the new format.
<mview>.release() # Releases the object's memory buffer.
Decode
<bytes> = bytes(<mview>) # Returns a new bytes object.
<bytes> = <bytes>.join(<coll_of_mviews>) # Joins mviews using bytes object as sep.
<array> = array('<typecode>', <mview>) # Treats mview as a sequence of numbers.
<file>.write(<mview>) # Writes mview to the binary file.
<list> = list(<mview>) # Returns a list of ints or floats.
<str> = str(<mview>, 'utf-8') # Treats mview as a bytes object.
<int> = int.from_bytes(<mview>, …) # `byteorder='little/big', signed=False`.
'<hex>' = <mview>.hex() # Treats mview as a bytes object.
Deque
A thread-safe list with efficient appends and pops from either side. Pronounced “deck”.
from collections import deque
<deque> = deque(<collection>, maxlen=None)
<deque>.appendleft(<el>) # Opposite element is dropped if full.
<deque>.extendleft(<collection>) # Collection gets reversed.
<el> = <deque>.popleft() # Raises IndexError if empty.
<deque>.rotate(n=1) # Rotates elements to the right.