Histories.ai is an open-source research project by Archie McKenzie, a student at Princeton University.
Herodotus' Histories, written ~430 BC, is a sweeping account of events in Europe, North Africa, and West Asia stretching from the Trojan War to 479 BC. The Histories is the first work of written history at scale – indeed, we derive the term "history" from Herodotus' use of the word ἱστορίη, meaning "inquiry". It is also a travelogue, a geography book, and above all, the world's greatest collection of rumors.
Herodotus relates an outrageous heist from the Pharaoh's treasury, the customs of blood-drinking steppe nomads, and the ecology of flying snakes just as readily as he reports on war and politics.
As Tom Holland writes in the preface to his translation:
"Herodotus is the most entertaining of historians. Indeed, he is as entertaining as anyone who has ever written – historian or not."
Histories.ai is a dissection of the foundational work of history with large language models, primarily OpenAI's GPT-4. It is an interactive, AI-powered translation of all nine books:
- GPT-4 wrote the English counterpart to the Greek text. This was done through a two-step process:
  - Asking GPT-4 to summarize each chapter in the A.D. Godley (1920) English translation.
  - Reprompting GPT-4 with that summary and the raw Greek text to produce an English translation for each chapter, taking into account previous chapters' context.

  GPT-4's translation is structurally close to the underlying Greek – excellent for comparing the Greek and English side-by-side. It is a faithful, but unrefined translation, not a fluid, idiomatic retelling. So, I expect it to be most useful to those studying Greek or reading the Histories as scholars.
- Semantic search is powered by vector embeddings. Every chapter has been embedded with OpenAI's text-embedding-ada-002 model. An arbitrary query like "prophetic dreams" is embedded the same way and compared with every chapter using cosine similarity. The most similar chapters are then returned.
- On-demand parsing is done by GPT-3.5. Click on a Greek word and gpt-3.5-turbo will parse it for you, giving its English transliteration, part of speech, meaning, and dictionary form. (Sometimes it can take a while if the word hasn’t been clicked before, and occasionally it fails.) Once the word has been parsed, the result is stored in a database for easy retrieval, so it should be faster the second time.
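For the curious, the "parse once, then reuse" behaviour of on-demand parsing can be sketched roughly as below. This is only an illustration, not the site's actual code: the use of SQLite, the `parses` table, and the `get_parse` and `fake_parse` names are all assumptions, and the model call is replaced by a stand-in function so the example runs offline.

```python
import sqlite3

def get_parse(conn, word, parse_fn):
    """Return a stored parse for `word`, calling `parse_fn` only on a miss."""
    row = conn.execute(
        "SELECT parse FROM parses WHERE word = ?", (word,)
    ).fetchone()
    if row is not None:
        return row[0]  # seen before: no model call needed
    result = parse_fn(word)  # slow path; if this raises, nothing is stored
    conn.execute(
        "INSERT INTO parses (word, parse) VALUES (?, ?)", (word, result)
    )
    conn.commit()
    return result

# Hypothetical schema: one row per Greek word form.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parses (word TEXT PRIMARY KEY, parse TEXT)")

calls = []  # records how often the stand-in "model" is invoked

def fake_parse(word):
    # Stand-in for a language model request (assumption, not a real API call).
    calls.append(word)
    return f"parsed:{word}"

first = get_parse(conn, "ἱστορίη", fake_parse)
second = get_parse(conn, "ἱστορίη", fake_parse)
```

Here the second lookup is answered from the table without invoking `fake_parse` again, which is why a word that has already been clicked should come back faster.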
See the FAQ for tips on how to structure search queries. (Or just loiter around the landing page for a while.)
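To give a feel for how the semantic search ranks chapters, here is a minimal sketch of cosine-similarity retrieval. It is an illustration under assumptions rather than the site's implementation: the three-dimensional vectors and the `ch-a`, `ch-b`, `ch-c` labels are made up, the function names are hypothetical, and real embeddings have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_chapters(query_vec, chapter_vecs, k=3):
    """Return the k chapter labels whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), label)
        for label, vec in chapter_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]

# Toy data (assumption): in practice each vector would come from the embedding model.
chapters = {
    "ch-a": [0.9, 0.1, 0.0],
    "ch-b": [0.0, 0.2, 0.9],
    "ch-c": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedded search query

results = top_chapters(query, chapters, k=2)
print(results)
```

Because cosine similarity depends on direction rather than length, a short query and a long chapter can still score as close matches when they point the same way in embedding space.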
Language models are probabilistic; they make mistakes. If you encounter a mistake, whether it's down to artificial or human incompetence, please let me know. If you haven't found a mistake but just want to chat about Herodotus, feel free to reach out!
Read the code which produced the translation here. Initial data was sourced from the Perseus Digital Library.
All text on Histories.ai is offered under the most permissive license possible. Everything produced by language models is dedicated to the public domain.