# Example of how to use Llama-3.1-8B-Instruct API in Python

## With a simple HTTP client (requests)

First install the *requests* library:

```bash
pip install requests
```

Next, export your access token to the *OVH_AI_ENDPOINTS_ACCESS_TOKEN* environment variable:

```bash
export OVH_AI_ENDPOINTS_ACCESS_TOKEN=
```

*If you do not have an access token key yet, follow the instructions in the [AI Endpoints – Getting Started](https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-endpoints-getting-started?id=kb_article_view&sysparm_article=KB0065401) guide.*

Finally, run the following Python code:

```python
import os

import requests

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN')}",
}

# With /chat/completions
url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions"
payload = {
    "max_tokens": 512,
    "messages": [
        {
            "content": "Explain gravity to a 6-year-old",
            "role": "user"
        }
    ],
    "model": "Llama-3.1-8B-Instruct",
    "temperature": 0,
}

response = requests.post(url, json=payload, headers=headers)

if response.status_code == 200:
    # Parse the JSON response
    response_data = response.json()
    choices = response_data["choices"]
    for choice in choices:
        text = choice["message"]["content"]
        # Process text and finish_reason
        print(text)
else:
    print("Error:", response.status_code, response.text)

# With /responses
url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/responses"
payload = {
    "max_output_tokens": 512,
    "input": [
        {
            "content": "Explain gravity to a 6-year-old",
            "role": "user"
        }
    ],
    "store": False,
    "model": "Llama-3.1-8B-Instruct",
    "temperature": 0,
}

response = requests.post(url, json=payload, headers=headers)

if response.status_code == 200:
    response_data = response.json()
    text = response_data["output"][0]["content"][0]["text"]
    print(text)
else:
    print("Error:", response.status_code, response.text)
```

## With the Python OpenAI library

The Llama-3.1-8B-Instruct API is
compatible with the OpenAI specification.

First install the *openai* library:

```bash
pip install openai
```

Next, export your access token to the *OVH_AI_ENDPOINTS_ACCESS_TOKEN* environment variable:

```bash
export OVH_AI_ENDPOINTS_ACCESS_TOKEN=
```

Finally, run the following Python code:

```python
import os

from openai import OpenAI

url = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1"
client = OpenAI(
    base_url=url,
    api_key=os.getenv("OVH_AI_ENDPOINTS_ACCESS_TOKEN")
)

def chat_completion(new_message: str) -> str:
    history_openai_format = [{"role": "user", "content": new_message}]
    return client.chat.completions.create(
        model="Llama-3.1-8B-Instruct",
        messages=history_openai_format,
        temperature=0,
        max_tokens=1024
    ).choices[0].message.content

def responses(new_message: str) -> str:
    response = client.responses.create(
        model="Llama-3.1-8B-Instruct",
        input=new_message,
        temperature=0,
        max_output_tokens=1024,
        store=False
    )
    return response.output[0].content[0].text

if __name__ == '__main__':
    # With the chat completions endpoint
    print(chat_completion("Explain gravity to a 6-year-old"))
    # With the responses endpoint
    print(responses("Explain gravity to a 12-year-old"))
```

## Model rate limit

When using AI Endpoints, the **following rate limits apply**:

- **Anonymous**: 2 requests per minute, per IP and per model.
- **Authenticated with an API access key**: 400 requests per minute, per Public Cloud project and per model.

If you exceed this limit, a **429 error code** will be returned. If you require higher usage, please **[get in touch with us](https://help.ovhcloud.com/csm?id=csm_get_help)** to discuss increasing your rate limits.

## Going Further

Want to explore the full capabilities of the LLM API?
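One capability worth trying is token streaming. Assuming the endpoint honors the OpenAI-style `stream=True` parameter (an assumption based on its OpenAI compatibility, not confirmed above), a minimal sketch looks like this; the `join_deltas` and `stream_chat` helpers are illustrative names, not part of any official API:

```python
import os


def join_deltas(deltas):
    """Concatenate streamed text deltas, skipping empty keep-alive chunks."""
    return "".join(d for d in deltas if d)


def stream_chat(prompt: str) -> str:
    """Stream a chat completion, printing each chunk as it arrives."""
    # Imported here so join_deltas stays usable without the openai package.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://oai.endpoints.kepler.ai.cloud.ovh.net/v1",
        api_key=os.getenv("OVH_AI_ENDPOINTS_ACCESS_TOKEN"),
    )
    stream = client.chat.completions.create(
        model="Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        max_tokens=512,
        stream=True,  # assumption: the endpoint supports OpenAI-style streaming
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content if chunk.choices else None
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return join_deltas(parts)


if __name__ == "__main__" and os.getenv("OVH_AI_ENDPOINTS_ACCESS_TOKEN"):
    stream_chat("Explain gravity to a 6-year-old")
```

Streaming is mostly a UX improvement: the total latency is similar, but the first tokens appear almost immediately instead of after the full generation.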
Dive into our dedicated [Structured Output](https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-endpoints-structured-output?id=kb_article_view&sysparm_article=KB0071891) and [Function Calling](https://help.ovhcloud.com/csm/en-gb-public-cloud-ai-endpoints-function-calling?id=kb_article_view&sysparm_article=KB0071907) guides. For a broader overview of AI Endpoints, explore the full [AI Endpoints Documentation](https://help.ovhcloud.com/csm/en-gb-documentation-public-cloud-ai-and-machine-learning-ai-endpoints?id=kb_browse_cat&kb_id=574a8325551974502d4c6e78b7421938&kb_category=ea1d6daa918a1a541e11d3d71f8624aa).

Reach out to our support team or join the [OVHcloud Discord](https://discord.gg/ovhcloud) #ai-endpoints channel to share your questions, feedback, and suggestions for improving the service with the team and the community.
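Finally, the **429 error code** mentioned in the rate-limit section above is easy to handle with exponential backoff. Here is a minimal sketch reusing the *requests* setup from the first section; the `backoff_delay` helper and the retry counts are illustrative choices, not values mandated by the service:

```python
import os
import time

URL = "https://oai.endpoints.kepler.ai.cloud.ovh.net/v1/chat/completions"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def post_with_retry(payload: dict, max_retries: int = 5) -> dict:
    """POST to the endpoint, sleeping and retrying whenever HTTP 429 is returned."""
    # Imported here so backoff_delay stays usable without the requests package.
    import requests

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.getenv('OVH_AI_ENDPOINTS_ACCESS_TOKEN')}",
    }
    for attempt in range(max_retries):
        response = requests.post(URL, json=payload, headers=headers)
        if response.status_code != 429:
            response.raise_for_status()  # surface non-rate-limit errors
            return response.json()
        time.sleep(backoff_delay(attempt))
    raise RuntimeError("Still rate limited after retries")
```

The `payload` argument is the same dict shown in the `/chat/completions` example above, e.g. `post_with_retry(payload)["choices"][0]["message"]["content"]`.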