My documented process: https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence but honestly I just tinker with this. Most of it isn’t useful IMHO, except some pieces, e.g. speech-to-text/text-to-speech (STT/TTS), from time to time. The LLM aspect itself is too unreliable, and there are two relatively recent papers on the topic that I like, namely:
- No “Zero-Shot” Without Exponential Data https://arxiv.org/abs/2404.04125
- ChatGPT is bullshit https://link.springer.com/article/10.1007/s10676-024-09775-5
which argue, respectively, that the long tail makes it practically impossible to train AI to be correct in rare cases, and that “hallucination” is a misnomer used for marketing purposes, better replaced by “bullshit”: output meant to convince people without any care for veracity.
Still, despite all this criticism it is a very popular topic, hyped as the “future” of computing. Consequently I wanted both to try it myself and to help others do so, rather than imagining it is restricted to some kind of “elite”. I try to keep the page up to date but so far, to be honest, I do it mostly defensively: so that I can genuinely criticize because I took the time to try, rather than rejecting it wholesale.
PS: I do also try the state of the art, both closed and open-source, via APIs, e.g. OpenAI or Mistral, but only for evaluation purposes, not as tools that are part of my daily usage.
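For that kind of occasional evaluation, a minimal sketch looks like the following. It assumes an OpenAI-compatible chat endpoint (Mistral’s API follows that convention); the base URL, model name, and key here are placeholders, not part of my documented setup.

```python
# Hedged sketch: building a request for an OpenAI-compatible
# /chat/completions endpoint, using only the standard library.
# Base URL, model name and key are illustrative assumptions.
import json
import urllib.request

def build_chat_request(base_url, api_key, model, prompt):
    """Return a urllib Request for an OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Example (only actually sent with a real key):
req = build_chat_request(
    "https://api.mistral.ai/v1", "YOUR_KEY", "mistral-small-latest", "2+2?"
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because several providers expose the same request shape, swapping the base URL and model is usually enough to compare closed and open models side by side.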
It is; if you are not ready to tinker, I do not recommend it.
Yet it works as-is, assuming you don’t need to work wirelessly. I use it on a nearly daily basis and it’s stable.