Perform completion inference on the service (Generally available)

POST /_inference/completion/{inference_id}

Get responses for completion tasks. This API works only with the completion task type.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

This API requires the `monitor_inference` cluster privilege (the built-in `inference_admin` and `inference_user` roles grant this privilege).

Path parameters

  • inference_id string Required

    The unique identifier of the inference endpoint.

Body Required (application/json)

  • input string | array[string] Required

    Inference input. Either a string or an array of strings.

  • task_settings object

    Task settings for the individual inference request. These settings are specific to the task type you specified and override the task settings specified when initializing the service.
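For example, a request can override a task setting inline. The sketch below assumes an endpoint backed by the OpenAI service; the `user` setting shown here is OpenAI-specific, and the settings actually available depend on the service behind the endpoint.

```console
POST _inference/completion/openai_completions
{
  "input": "What is Elastic?",
  "task_settings": {
    "user": "example-user-id"
  }
}
```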

Responses

  • 200 application/json
    • completion array[object] Required

      The list of completion result objects.

      • result string Required
Console:

```console
POST _inference/completion/openai_completions
{
  "input": "What is Elastic?"
}
```

Python:

```python
resp = client.inference.completion(
    inference_id="openai_completions",
    input="What is Elastic?",
)
```

JavaScript:

```js
const response = await client.inference.completion({
  inference_id: "openai_completions",
  input: "What is Elastic?",
});
```

Ruby:

```ruby
response = client.inference.completion(
  inference_id: "openai_completions",
  body: {
    "input": "What is Elastic?"
  }
)
```

PHP:

```php
$resp = $client->inference()->completion([
    "inference_id" => "openai_completions",
    "body" => [
        "input" => "What is Elastic?",
    ],
]);
```

curl:

```shell
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"input":"What is Elastic?"}' "$ELASTICSEARCH_URL/_inference/completion/openai_completions"
```

Java:

```java
client.inference().completion(c -> c
    .inferenceId("openai_completions")
    .input("What is Elastic?")
);
```
Request example
Run `POST _inference/completion/openai_completions` to perform a completion on the example question.
```json
{
  "input": "What is Elastic?"
}
```
Response examples (200)
A successful response from `POST _inference/completion/openai_completions`.
```json
{
  "completion": [
    {
      "result": "Elastic is a company that provides a range of software solutions for search, logging, security, and analytics. Their flagship product is Elasticsearch, an open-source, distributed search engine that allows users to search, analyze, and visualize large volumes of data in real-time. Elastic also offers products such as Kibana, a data visualization tool, and Logstash, a log management and pipeline tool, as well as various other tools and solutions for data analysis and management."
    }
  ]
}
```
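Each element of the `completion` array carries a `result` string. A minimal sketch of pulling the generated text out of a parsed response body (the payload below mirrors the documented response shape; the actual text depends on the model behind the endpoint):

```python
# A parsed response body in the documented shape (illustrative payload).
response = {
    "completion": [
        {"result": "Elastic is a company that provides search software."}
    ]
}

# Each element of the `completion` array holds one `result` string.
results = [item["result"] for item in response["completion"]]
print(results[0])
```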