Making Raw SQL Easier to Read with Row Factory
python sqlite postgreOne thing that annoy me when using raw query in Python is how the returned value is not “easy to remember”. Look at this query:
import sqlite3
con = sqlite3.connect("library.sqlite3")
query = "SELECT id, isbn, title FROM books"
con = con.execute(query)
rows = con.fetchall()
To print the returned value I must remember the index of each field:
for r in rows:
print(f"{r[0]} - {r[1]} - {r[2]}")
# index 0 = id
# index 1 = isbn
# index 2 = title
1 - 1234567890 - Book 1 by Author 1
2 - 2345678901 - Book 1 by Author 1
3 - 3456789012 - Book 1 by Author 1
4 - 4567890123 - Book 1 by Author 2
5 - 5678901234 - Book 1 by Author 2
And if you think it’s only coming from sqlite, you’re wrong. I tried PostgreSQL with psycopg and it has same behavior:
import psycopg2
conn = psycopg2.connect()
cursor = conn.cursor()
query = "SELECT id, isbn, title FROM books"
cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
print(f"{r[0]} - {r[1]} - {r[2]}")
So is using ORM the answer? Luckily, it’s not the only answer. After reading the docs, I found sqlite and postgre lib have something called row factory.
Sqlite3.Row
import sqlite3
con = sqlite3.connect("library.sqlite3")
con.row_factory = sqlite3.Row # add this
query = "SELECT id, isbn, title FROM books"
con = con.execute(query)
rows = con.fetchall()
for r in rows:
print(f"{r['id']} - {r['isbn']} - {r['title']}")
Psycopg2.DictCursor
import psycopg2
from psycopg2 import extras
conn = psycopg2.connect()
cursor = conn.cursor(cursor_factory=extras.DictCursor)
query = "SELECT id, isbn, title FROM books"
cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
print(f"{r['id']} - {r['isbn']} - {r['title']}")
Or you can return result like ORM using dataclass (for sqlite or psycopg3) or NamedTuple (for psycopg2):
sqlite with Dataclass
import sqlite3
from dataclasses import dataclass
@dataclass
class Books:
id: int
isbn: str
title: str
def converter_books(cursor, row):
fields = [column[0] for column in cursor.description]
return Books(**{k: v for k, v in zip(fields, row)})
con = sqlite3.connect("library.sqlite3")
con.row_factory = converter_books
query = "SELECT id, isbn, title FROM books"
con = con.execute(query)
rows = con.fetchall()
for r in rows:
print(f"{r.id} - {r.isbn} - {r.title}")
Psycopg2 With NamedTuple
import psycopg2
from psycopg2 import extras
conn = psycopg2.connect()
cursor = conn.cursor(cursor_factory=extras.NamedTupleCursor)
query = "SELECT id, isbn, title FROM books"
cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
print(f"{r.id} - {r.isbn} - {r.title}")
Or another version of psycopg
Psycopg3 With Dataclass
import psycopg
from dataclasses import dataclass
from psycopg.rows import class_row
@dataclass
class Books:
id: int
isbn: str
title: str
conn = psycopg.connect()
cursor = conn.cursor(row_factory=class_row(Books))
query = "SELECT id, isbn, title FROM books"
cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
print(f"{r.id} - {r.isbn} - {r.title}")
So which one is better?
I tried small benchmark and found out:
sqlite
Default:
11946 function calls in 0.157 seconds
sqlite3.Row:
11946 function calls in 0.165 seconds
Dataclass:
52279 function calls in 0.231 seconds
Psycopg
Default:
78624 function calls in 0.521 seconds
DictRow:
288642 function calls in 0.578 seconds
Dataclass/class_row:
288983 function calls in 0.569 seconds
So for sqlite I prefer sqlite3.Row, and for postgre I think class_row/dataclass is good enough.