Chinese ebook reader Boox ditches GPT for state-censored China LLM pushing propaganda

tardigrada@beehaw.org · 1 year ago


hersh · 1 year ago

That’s pretty much what I do, yeah. On my computer or phone, I split an epub into individual text files for each chapter using pandoc (or similar tools). Then after I read each chapter, I upload it into my summarizer, and perhaps ask some pointed questions.
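
The per-chapter split described above could be sketched roughly like this in Python. This is only an illustration, not the commenter's actual script: it assumes pandoc is installed and on the PATH, that its Markdown output marks each chapter with a top-level `# ` heading (which depends on how the epub is structured), and the names `epub_to_markdown` and `split_chapters` are made up for this example.

```python
import re
import subprocess


def epub_to_markdown(epub_path):
    """Convert an epub to Markdown by calling the pandoc CLI (assumed installed)."""
    result = subprocess.run(
        ["pandoc", epub_path, "-t", "markdown"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def split_chapters(markdown_text):
    """Split Markdown into (title, body) pairs at top-level '# ' headings.

    Assumption: each chapter starts with a level-1 ATX heading. Text before
    the first heading is kept under the hypothetical title 'front-matter'.
    A line starting with '# ' inside a code block would be misread as a heading.
    """
    chapters = []
    title, lines = None, []

    def flush():
        # Only store a chunk if it has a heading or some non-blank text.
        if title is not None or any(line.strip() for line in lines):
            chapters.append((title or "front-matter", "\n".join(lines).strip()))

    for line in markdown_text.splitlines():
        match = re.match(r"^# (.+)$", line)
        if match:
            flush()
            title, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return chapters
```

Each `(title, body)` pair can then be written to its own text file and uploaded one chapter at a time. Books whose chapters are not marked with level-1 headings would need a different split rule.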

It’s important to use a tool that stays confined to the context of the provided file. My first test when trying such a tool is to ask it a general-knowledge question that’s not related to the file. The correct answer is something along the lines of “the text does not provide that information”, not an answer that it pulled out of thin air (whether it’s correct or not).
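
That sanity check can be automated in a rough way. The sketch below assumes the summarizer is reachable as a plain Python callable `ask(prompt) -> str`, which is a placeholder for whatever tool is actually in use; the prompt wording and the `REFUSAL_MARKERS` list are likewise illustrative guesses, not anything taken from a specific product.

```python
# Phrases that suggest the model declined to answer from outside the text.
# This list is an assumption; adjust it to how your tool words its refusals.
REFUSAL_MARKERS = (
    "does not provide",
    "doesn't provide",
    "does not mention",
    "doesn't mention",
    "not mentioned",
    "does not contain",
    "no information",
)


def build_prompt(chapter_text, question):
    """Wrap the chapter and the question in an instruction to stay in context."""
    return (
        "Answer using only the text below. If the text does not provide "
        "the answer, say so.\n\n"
        f"TEXT:\n{chapter_text}\n\n"
        f"QUESTION: {question}"
    )


def stays_in_context(ask, chapter_text, off_topic_question):
    """Return True if the model declines an unrelated general-knowledge question.

    `ask` is a hypothetical callable that sends a prompt to the summarizer
    and returns its reply as a string.
    """
    answer = ask(build_prompt(chapter_text, off_topic_question))
    # Normalize curly apostrophes so "doesn’t" matches "doesn't".
    normalized = answer.lower().replace("\u2019", "'")
    return any(marker in normalized for marker in REFUSAL_MARKERS)
```

Keyword matching like this is crude, so reading the reply yourself is still the more reliable check. The point is only that a reply such as "Paris" to "What is the capital of France?" fails the test even though it is factually correct.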

Jayjader@jlai.lu · 1 year ago

Ooooh, that’s a good first test / “sanity check”!

May I ask what you are using as a summarizer? I’ve played around with running models from Hugging Face locally, but never did any fine-tuning or straight-up training “from scratch”. My (paltry) experience with the HF models is that they’re incapable of staying confined to the given context.