Inference support on HF
Having free and commercial inference providers available is a great start*. Could we also get inference support working on Hugging Face?
Vote up this thread to boost interest.
* One of the criteria for the LTA evaluation
It's already available today on many public clouds, such as Azure, AWS, Exoscale, Phoenix, and more.
There is also a public API at platform.publicai.co.
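A minimal sketch of querying it, assuming the platform exposes an OpenAI-compatible chat completions endpoint; the base URL, API key, and model ID below are placeholders, so check platform.publicai.co for the actual values:

```python
# Sketch: call an OpenAI-compatible chat completions endpoint.
# ASSUMPTIONS: the base URL and model ID are placeholders, not confirmed
# values; see platform.publicai.co for the real ones and how to get a key.
from openai import OpenAI

client = OpenAI(
    base_url="https://platform.publicai.co/v1",  # placeholder base URL
    api_key="YOUR_API_KEY",                      # placeholder key
)

resp = client.chat.completions.create(
    model="MODEL_ID",  # placeholder model ID
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```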
It's also supported for on-device use with LM Studio and MLX-LM, as in the sketch below.
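For the MLX-LM route on Apple Silicon, a minimal on-device sketch; the repo ID is a placeholder standing in for an MLX-converted checkpoint on the Hub:

```python
# Sketch: run the model on-device with mlx-lm (Apple Silicon).
# ASSUMPTION: "REPO_ID" is a placeholder for an MLX-converted checkpoint.
from mlx_lm import load, generate

model, tokenizer = load("REPO_ID")  # placeholder repo ID
text = generate(
    model,
    tokenizer,
    prompt="Say hello in one sentence.",
    max_tokens=64,
)
print(text)
```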
Inference platforms supported so far: vLLM, SGLang, and transformers.
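For the transformers path, a short loading and sampling sketch, again with a placeholder repo ID:

```python
# Sketch: load and sample with transformers; "REPO_ID" is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("REPO_ID")
model = AutoModelForCausalLM.from_pretrained("REPO_ID", device_map="auto")

inputs = tokenizer("Say hello in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```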
Thanks @mjaggi for the summary, that's all great, but the criterion refers specifically to an open playground accessible directly from the Model Card, like this one from Llama:
It would be fine to have just the smaller 8B model available. I'm already discussing this with the HF team and will close the issue as soon as it's possible to at least launch an Inference Endpoint without any workarounds.
Public AI has now become an official Hugging Face inference provider.
https://huggingface.co/blog/inference-providers-publicai
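That means the model should now be reachable through the standard huggingface_hub client. A sketch, assuming the provider string is "publicai" (inferred from the blog URL) and using a placeholder model ID:

```python
# Sketch: route a chat request through the new provider via huggingface_hub.
# ASSUMPTIONS: the provider string "publicai" is inferred from the blog URL;
# "MODEL_ID" is a placeholder repo ID.
from huggingface_hub import InferenceClient

client = InferenceClient(provider="publicai", api_key="YOUR_HF_TOKEN")
completion = client.chat.completions.create(
    model="MODEL_ID",  # placeholder model ID
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```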
