
Making Raw SQL Easier to Read with Row Factory

python sqlite postgre

One thing that annoys me when using raw queries in Python is that the returned values are not “easy to remember”. Look at this query:

import sqlite3
con = sqlite3.connect("library.sqlite3")
query = "SELECT id, isbn, title FROM books"
cur = con.execute(query)
rows = cur.fetchall()

To print the returned values I have to remember the index of each field:

for r in rows:
	print(f"{r[0]} - {r[1]} - {r[2]}")

# index 0 = id
# index 1 = isbn
# index 2 = title

1 - 1234567890 - Book 1 by Author 1
2 - 2345678901 - Book 1 by Author 1
3 - 3456789012 - Book 1 by Author 1
4 - 4567890123 - Book 1 by Author 2
5 - 5678901234 - Book 1 by Author 2

And if you think this behavior is specific to sqlite, you’re wrong. I tried PostgreSQL with psycopg2 and it behaves the same way:

import psycopg2
conn = psycopg2.connect()  # connection parameters omitted
cursor = conn.cursor()
query = "SELECT id, isbn, title FROM books"

cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
    print(f"{r[0]} - {r[1]} - {r[2]}")

So is using an ORM the answer? Luckily, it’s not the only one. After reading the docs, I found that both the sqlite3 and psycopg libraries have something called a row factory.
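
In sqlite3, a row factory is just a callable that receives the cursor and the raw row tuple and returns whatever object you want. As a minimal sketch before the built-in helpers (the dict_factory name here is mine, not part of the library), you could roll your own dict-based factory:

import sqlite3

# any callable taking (cursor, row) can be a row factory;
# dict_factory is a hypothetical name used only for this sketch
def dict_factory(cursor, row):
    fields = [column[0] for column in cursor.description]
    return dict(zip(fields, row))

con = sqlite3.connect("library.sqlite3")
con.row_factory = dict_factory

cur = con.execute("SELECT id, isbn, title FROM books")
for r in cur.fetchall():
    print(f"{r['id']} - {r['isbn']} - {r['title']}")

The built-in helpers below do essentially the same thing, with less code to maintain.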

sqlite3.Row

import sqlite3
con = sqlite3.connect("library.sqlite3")
con.row_factory = sqlite3.Row  # add this

query = "SELECT id, isbn, title FROM books"
cur = con.execute(query)
rows = cur.fetchall()

for r in rows:
	print(f"{r['id']} - {r['isbn']} - {r['title']}")

psycopg2.extras.DictCursor

import psycopg2
from psycopg2 import extras

conn = psycopg2.connect()
cursor = conn.cursor(cursor_factory=extras.DictCursor)
query = "SELECT id, isbn, title FROM books"

cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
    print(f"{r['id']} - {r['isbn']} - {r['title']}")

Or you can return results the way an ORM would, using a dataclass (for sqlite or psycopg 3) or a NamedTuple (for psycopg2):

sqlite3 With Dataclass

import sqlite3
from dataclasses import dataclass

@dataclass
class Books:
	id: int
	isbn: str
	title: str

def converter_books(cursor, row):
	# cursor.description has one tuple per column; its first item is the column name
	fields = [column[0] for column in cursor.description]
	return Books(**{k: v for k, v in zip(fields, row)})

con = sqlite3.connect("library.sqlite3")
con.row_factory = converter_books

query = "SELECT id, isbn, title FROM books"
cur = con.execute(query)
rows = cur.fetchall()

for r in rows:
	print(f"{r.id} - {r.isbn} - {r.title}")

Psycopg2 With NamedTuple

import psycopg2
from psycopg2 import extras

conn = psycopg2.connect()
cursor = conn.cursor(cursor_factory=extras.NamedTupleCursor)
query = "SELECT id, isbn, title FROM books"

cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
    print(f"{r.id} - {r.isbn} - {r.title}")

Or with the newer major version of psycopg:

Psycopg3 With Dataclass

import psycopg
from dataclasses import dataclass
from psycopg.rows import class_row

@dataclass
class Books:
    id: int
    isbn: str
    title: str

conn = psycopg.connect()
cursor = conn.cursor(row_factory=class_row(Books))
query = "SELECT id, isbn, title FROM books"

cursor.execute(query)
rows = cursor.fetchall()
for r in rows:
    print(f"{r.id} - {r.isbn} - {r.title}")

So which one is better?

I ran a small benchmark and found the following (a sketch of the profiling setup is shown after the numbers):

sqlite

Default:
11946 function calls in 0.157 seconds

sqlite3.Row:
11946 function calls in 0.165 seconds

Dataclass:
52279 function calls in 0.231 seconds

Psycopg

Default:
78624 function calls in 0.521 seconds

DictRow:
288642 function calls in 0.578 seconds

Dataclass/class_row:
288983 function calls in 0.569 seconds
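
The numbers above are cProfile-style output. I’m assuming something along these lines was profiled; a sketch for the sqlite3.Row variant (the function name is just a placeholder, and the other variants would only swap the row factory):

import cProfile
import sqlite3

def fetch_with_row_factory():
    # same query as the examples above; change the factory to compare variants
    con = sqlite3.connect("library.sqlite3")
    con.row_factory = sqlite3.Row
    cur = con.execute("SELECT id, isbn, title FROM books")
    return cur.fetchall()

cProfile.run("fetch_with_row_factory()")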

So for sqlite I prefer sqlite3.Row, and for Postgres I think class_row with a dataclass is good enough.