Pricing (2024)

Chat completionrequests are billed based on the number of input tokens sent plus the number of tokens in the output(s) returned by the API.

Your request may use up to num_tokens(input) + [max_tokens * max(n, best_of)] tokens, which will be billed at the per-engine rates outlined at the top of this page.

In the simplest case, if your prompt contains 200 tokens and you request a single 900 token completion from the gpt-3.5-turbo-1106 API, your request will use 1100 tokens and will cost [(200 * 0.001) + (900 * 0.002)] / 1000 = $0.002.

You can limit costs by reducing prompt length or maximum response length, limiting usage ofbest_of/n , adding appropriate stop sequences, or using engines with lower per-token costs.

Pricing (2024)
Top Articles
Latest Posts
Article information

Author: Francesca Jacobs Ret

Last Updated:

Views: 6316

Rating: 4.8 / 5 (48 voted)

Reviews: 87% of readers found this page helpful

Author information

Name: Francesca Jacobs Ret

Birthday: 1996-12-09

Address: Apt. 141 1406 Mitch Summit, New Teganshire, UT 82655-0699

Phone: +2296092334654

Job: Technology Architect

Hobby: Snowboarding, Scouting, Foreign language learning, Dowsing, Baton twirling, Sculpting, Cabaret

Introduction: My name is Francesca Jacobs Ret, I am a innocent, super, beautiful, charming, lucky, gentle, clever person who loves writing and wants to share my knowledge and understanding with you.