In early 2020, it became really difficult to read books. I'm slowly starting to get past that at last, but while Goodreads was super fun for a few years, right now tracking and counting and rating just makes reading feel like a chore and Not Fun as a hobby.
I still would like to keep a record of what I read, though. And I used to write book reviews on this blog before getting onto Goodreads. It seems like a nice way to keep a record of what I read without it becoming a number thing. I'll probably still write Goodreads reviews for small authors since I know it can help, but I feel less pressed about having everything there.
I also wanted to save the reviews I wrote only on Goodreads here, and wrote a script to migrate my reviews into Pelican-friendly Markdown pages since that's what powers this blog now. I decided to keep to a single entry per year rather than one entry per book, since I had a few good reading years in there (and others with only a single review!)
Step 1: CSV export of the books
First, if you go to 'My books' and find the 'Tools' menu at the bottom of the left-side menu on Goodreads, you'll find a page to export your library. You may have to try a couple of times: my first export only had a handful of books, and the second one looked more comprehensive although the number of books was off by two, but what can you do.
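Since an export can come back incomplete, a quick count before going further can save some head-scratching. This is only a sketch: `summarize_export` is a made-up helper name, the sample rows are invented, and it assumes the export has the 'Date Read' column that the script in Step 2 relies on.

```python
import csv
import io


def summarize_export(csvfile):
    """Count all rows, and rows with a non-empty 'Date Read', in an export."""
    total = 0
    with_date_read = 0
    for row in csv.DictReader(csvfile):
        total += 1
        # 'Date Read' is empty for books that were never marked as finished
        if (row.get('Date Read') or '').strip():
            with_date_read += 1
    return total, with_date_read


# Tiny invented sample standing in for goodreads_library_export.csv
sample = io.StringIO(
    "Title,Author,Date Read\n"
    "Book A,Author A,2020/01/15\n"
    "Book B,Author B,\n"
)
print(summarize_export(sample))  # (2, 1)
```

With the real file, pass `open('goodreads_library_export.csv', newline='')` instead of the sample and compare the two numbers with what Goodreads shows for your library.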
Step 2: Python script to create the Markdown pages
This is the script I wrote to extract only the books I've actually read. I don't really care about the DNF (did not finish) and to-read books right now. I also hardcoded the years relevant to me. May someone find something helpful in here!
import csv
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Review:
    title: str
    author: str
    date_read: str  # kept as the raw string from the CSV
    review: str
    rating: int


def get_reviews(year):
    reviews = []
    with open('goodreads_library_export.csv', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            try:
                r = Review(row['Title'],
                           row['Author'],
                           row['Date Read'],
                           row['My Review'],
                           int(row['My Rating']))
            except ValueError:
                # When the int() cast fails, usually it means the CSV is
                # corrupted for that line. In my case, it was for a few to-read
                # records so I ignore them rather than attempt to fix the
                # original CSV. You can print the row here if you want to check
                # what's failing.
                # Skip the row: falling through would reuse the previous
                # row's review.
                continue
            if year is None or year in r.date_read:
                reviews.append(r)
    return reviews


def rating_or_review(review):
    # If I wrote a review, return that
    if review.review:
        return review.review
    # Otherwise, make the rating into words.
    if review.rating >= 4:
        return "I really enjoyed it."
    elif review.rating == 3:
        return "It was fine."
    else:
        return "Wasn't for me."


def format_reviews(reviews, year):
    with open(f'book-reviews-{year}.md', 'w') as f:
        f.write(f"Title: Book reviews: Year {year}\n")
        f.write(f"Date: {datetime.now().isoformat()}\n")
        f.write("tags: book review\n\n")
        for r in reviews:
            # A couple of abandoned books sneaked in with a '0' rating, and I'm
            # not interested in preserving those
            if r.rating != 0:
                f.write(f"## {r.title} by {r.author}\n\n")
                f.write(f"{rating_or_review(r)}\n")
                f.write("\n")


for year in range(2013, 2023):
    reviews = get_reviews(str(year))
    # Chronological order
    reviews = sorted(reviews, key=lambda r: r.date_read)
    format_reviews(reviews, year)
The hardest part was probably to decide what text to convert a rating into, since I didn't want to keep numbers!
Step 3: Checking the output looks right and recalling fond memories
I used the 'Year in Books' pages on Goodreads to compare the results. There was some funkiness sometimes, like a book read in 2011 showing up in 'Year in Books' but without any shelves and with a date read of 2020, even though I don't remember messing with it. The review also shows as Jan 2020 on the Goodreads UI despite appearing in the correct 'Year in Books'. 2020 turned out to be the date_added (which is definitely wrong) while the date_read field is empty. Maybe some data migration funkiness on the Goodreads side at some point during the last 12 years. Otherwise, I found one duplicate and a couple of intra-Goodreads links that didn't work.
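A book with a rating but an empty read date is exactly the kind of record that silently drops out of the yearly pages, because the script filters on the read date. Here is a minimal sketch for listing such rows, assuming the same 'Title', 'Date Read' and 'My Rating' columns the script reads. `find_undated_rated` is a hypothetical helper name and the sample rows are invented.

```python
import csv
import io


def find_undated_rated(csvfile):
    """Return titles of rows that have a rating but no 'Date Read'."""
    titles = []
    for row in csv.DictReader(csvfile):
        try:
            rating = int(row['My Rating'])
        except (KeyError, TypeError, ValueError):
            # Skip rows where the rating is missing or not a number
            continue
        if rating > 0 and not (row.get('Date Read') or '').strip():
            titles.append(row['Title'])
    return titles


# Invented sample standing in for goodreads_library_export.csv
sample = io.StringIO(
    "Title,Author,Date Read,My Rating\n"
    "Book A,Author A,2020/01/15,4\n"
    "Book B,Author B,,5\n"
    "Book C,Author C,,0\n"
)
print(find_undated_rated(sample))  # ['Book B']
```

Anything this turns up still needs a manual look on Goodreads, since the CSV alone can't say which year the book really belongs to.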
I still have to clean up the file for 2022. I was getting annoyed with tracking myself so I didn't write reviews, but if it's for the blog I wouldn't mind adding a few notes. And I need to decide if I want to post my 2023 reviews as I go, or batch them in some way!