So you just finished setting up a Hugging Face endpoint and went well that was easy now I wonder how hard it would be to test it on my phone? Turns out very easy once the app with the needed configuration settings is found. After installing close to a dozen from the Apple store I finally came across Mollama that would allow a custom API url along with a key.

Go to settings and add a new model provider under custom. Under the Base URL enter your endpoints URL and add a /v1 at the end. Then enter your API key. Make sure to only enable read only permissions to the Inference Endpoint for the new API key by only checking “Make calls to your Inference Endpoints”. Tap Models and under Model ID enter a / and then whatever model name of your choosing. And you’re done.

PS the app was made to work with OpenRouter and Ollama.