I’m writing this down to record my process so I can revisit it if I need to.
On my home Linux desktop I installed Audiblez:
https://github.com/santinic/audiblez/
I can’t get the GUI to work, but the command line interface is easy:
audiblez book.epub -v af_heart
I give audiblez an epub (a fanfic from archiveofourown) and tell it which voice to use. I’ve tried af_heart [American female voice] and bm_george [British male voice]; both are fun. I’m leaning more towards heart after listening to 12+ hours of george, but next I’m going to try bella, as that one is highly rated.
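If I ever want to queue up a whole folder of fics, something like this should work. It is only a sketch: narrate_all is a helper name I made up (it is not part of audiblez), and it assumes the same invocation as the single-book command above works for every file.

```shell
#!/bin/sh
# Sketch: run audiblez over every .epub in a folder with one chosen voice.
# narrate_all is a hypothetical helper name, not part of audiblez.

narrate_all() {
    dir="${1:-.}"               # folder to scan, defaults to the current one
    voice="${2:-af_heart}"      # voice name, defaults to af_heart
    for book in "$dir"/*.epub; do
        [ -e "$book" ] || continue   # the glob matched nothing
        echo "Narrating $book with $voice"
        # Same invocation as the single-book command above.
        audiblez "$book" -v "$voice"
    done
}
```

Usage would be along the lines of: narrate_all ~/fics bm_george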
The program uses Kokoro-82M, a text-to-speech model (https://huggingface.co/hexgrad/Kokoro-82M). I like that the model page includes information about the source of its training data (all open) and training costs.
To play the resulting files on my phone I’m using Voice from F-Droid:
https://f-droid.org/en/packages/de.ph1b.audiobook
The files that audiblez makes are hefty, but Voice can play Opus, an audio compression format that works well for narration.
I take the m4b made by audiblez and use ffmpeg to make an opus file:
ffmpeg -i 'input.m4b' -c:a libopus -b:a 128k output.opus
An 821.5 kB epub becomes an 8.3 GB m4b, which becomes a 1.4 GB opus.
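For more than one book, the conversion step could be wrapped in a small loop. This is a rough sketch, not something I have battle-tested: opus_name and convert_all are names I made up, and the ffmpeg settings are simply the same ones as in the single-file command above.

```shell
#!/bin/sh
# Sketch: convert every .m4b in a folder to Opus, skipping ones already done.
# opus_name and convert_all are hypothetical helper names.

# Derive the output name: "book.m4b" -> "book.opus".
opus_name() {
    printf '%s\n' "${1%.m4b}.opus"
}

convert_all() {
    dir="${1:-.}"                    # folder to scan, defaults to the current one
    for src in "$dir"/*.m4b; do
        [ -e "$src" ] || continue    # the glob matched nothing
        out="$(opus_name "$src")"
        [ -e "$out" ] && continue    # already converted, skip it
        # Same ffmpeg settings as the single-file command above.
        ffmpeg -i "$src" -c:a libopus -b:a 128k "$out"
    done
}
```

Usage would be along the lines of: convert_all ~/audiobooks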
What I wish for is to spin up a Storyteller instance for myself (https://storyteller-platform.gitlab.io/storyteller/) and have my audiobooks sync up with my epubs, so I can move back and forth between reading and listening and pick up where I left off. I know Audible has that tech but I’ve never used it. It seems cool.
If the app could convert the epub to opus and make the epub media overlay that Storyteller uses, that would be very cool. I suspect that before I figure that out, the text-to-speech engine on my phone will get so good that I won’t need to pre-make my files. To be honest, I used the built-in text-to-speech on my phone without much complaint, but that was before I got a taste for this new world that sounds so much better.
This post is rambling because it is simply an attempt to document my process for making audiobooks. A reminder recipe more than anything else. It was inspired by using https://elevenlabs.io/, watching it eat up all my mobile data, and realizing, hey, I bet I could spin this up on my own computer. And I have! It takes a lot of hours for audiblez to process a book on my desktop. I put a note on the computer not to turn it off. Getting a server up and running (one that doesn’t get turned off) is a good idea, especially if it has better specs for doing the converting work.
I am enjoying audiobooks from fanfic. You can too! With the magic of FOSS. I’m not sharing what I make because I haven’t gotten permission from any authors to do so.