Out of Sorts, take 4

September 2016 updates: major-league Python code tidying (and new data listings).

Having, erm, sorted out book list sorting back in January 1989 (on my Amstrad CP/M PC), and, a year later, book list sorting (under Acorn RISC OS), how about some really modern book list sorting, undertaken with the help of a SQLite DB and some adroit Python snake charming? On a 64-bit Linux Mint system?

I've been teased...

... by Brian for my Luddite approach to data file handling. He advocates Python for text field handling... basically. For the last umpty¹ years I've kept ASCII data files of my books, videos, and music. Until 1994, I would occasionally print these out as jolly nice DTP catalogues that the print room staff in the basement of IBM Hursley House were happy to copy, collate, and bind for me in return for small-scale financial inducements, a few good jokes, or just some of my more scandalous gossip.

What happened in 1994? Basically, the web.

Since the books I'd been writing in IBM all used a markup language, HTML was just another set of tags to handle. With less paper² to print. So my book lists lived on in cyberspace. With the block and column edit facilities of the TextPad editor on Windows I would sort them by Author, by Title, by Genre, by price, by date of purchase... and then whack them in between a pair of <pre> and </pre> tags, with a bit of HTML / CSS wrappering, letting the web browser do the work. Tab-separated data. What could be easier?

"Why do it that way?" asked Brian, horrified at the thought of treating data in such a haphazard, cavalier way. (He was too polite to say "amateurish", but I've known him long enough to know that's what he thought. And he's right.) "Show me a better way", said I. He did.

I now have a SQLite DB that I can enter data directly into, or load from an ASCII data file by Python code. Further code extracts, sorts, and formats the data (with some CSS assistance from me). It handles all the special cases of authors whose names contain accents, or who insist on a surname beginning with a lowercase³ letter, or book titles similarly afflicted. The generated web pages are indistinguishable from the hand-crafted horrors I've laboriously produced for two decades. But with one crucial difference. These web pages take mere seconds of CPU time to generate. Plus a bit of typing at a command prompt. Using TextPad on Windows, it could take me an hour or more to prepare the same⁴ files.
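For the curious, here is a minimal sketch of the sort of thing involved. It is emphatically not the actual code, just an illustration under stated assumptions: a hypothetical three-column table (author, title, genre), tab-separated input lines, and made-up function names (sort_key, load_books, books_as_pre). The sort key strips accents and folds case, so a lowercase surname or an accented initial ends up where a human would expect it rather than after "Z".

```python
import html
import sqlite3
import unicodedata


def sort_key(text):
    """Strip accents and fold case, so 'Čapek' sorts as 'capek'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def load_books(conn, lines):
    """Load tab-separated author/title/genre lines (assumed layout) into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books (author TEXT, title TEXT, genre TEXT)"
    )
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    conn.executemany("INSERT INTO books VALUES (?, ?, ?)", rows)


def books_as_pre(conn):
    """Extract every row, sort by author then title, and wrap in <pre> tags."""
    rows = conn.execute("SELECT author, title, genre FROM books").fetchall()
    rows.sort(key=lambda row: (sort_key(row[0]), sort_key(row[1])))
    body = "\n".join(html.escape("\t".join(row)) for row in rows)
    return "<pre>\n" + body + "\n</pre>"


if __name__ == "__main__":
    # Illustrative rows only, not an extract from the real data files.
    sample = [
        "Zola, Émile\tGerminal\tFiction\n",
        "cummings, ee\tmay i feel said he\tPoetry\n",
        "Čapek, Karel\tR.U.R.\tDrama\n",
        "Cuneo, John\tnEuROTIC\tCartoons\n",
    ]
    conn = sqlite3.connect(":memory:")
    load_books(conn, sample)
    print(books_as_pre(conn))
```

Sorting in Python rather than with SQL's ORDER BY is a deliberate choice in this sketch: SQLite's built-in collations compare accented characters byte by byte, so doing the accent and case folding in the key function keeps the special cases in one visible place.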

Date: 24 September 2016

See how they run!

A recent example, run on 24th September, 2016 on Skylark:

Footnotes

¹  In the case of books, "umpty" is an integer number of years approaching 50, though the lists were (as mentioned here) originally hand-written in notebooks, and later listed on a card index system of Baroque complexity. I was an odd child.
²  I was greatly amused (though not in a good way) when summoned on one occasion up to the Lab Director's office and told, on arrival, to explain to her secretary the art of printing out web pages so that the Lab Director could then read them. Indeed, I have often been greatly amused by the evidence I've seen with my own eyes of senior IBMers who variously failed to engage with new-fangled technology of various sorts. Not that I'm squeaky clean in this respect. (Or any other, for that matter.)
³  Author ee cummings managed a Doubleplusungood with both his name, and his book title: "may i feel said he". As for John Cuneo's "nEuROTIC" (a lovely little book of vicious cartoons), what can one say?
⁴  In the mid-1980s, sorting and merging my books database on CP/M, using Mallard BASIC, it took 20 minutes just to chew through 5,000 titles. Let's not even think about how long the subsequent noisy impact printing took. Both disk and memory were at such a premium back then that I was forced to keep three separate data files of my books: fiction, non-fiction, and science fiction. Hence the need for some adroit merging.