By Christian Prokopp on 2023-11-23
Recently, OpenAI released the GPT-4 Turbo preview with a 128k-token context window at its DevDay. That addresses a serious limitation for Retrieval Augmented Generation (RAG) applications, which I described in detail for Llamar.ai. 128k tokens (2¹⁷ = 131,072) amounts to nearly 200 pages of text, assuming approximately 0.75 words per token and 500 words per page.
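For reference, the back-of-the-envelope calculation in Python, using the same rough assumptions of 0.75 words per token and 500 words per page:

```python
# Rough estimate of how much text fits in a 128k-token context window.
context_tokens = 2**17        # 131,072 tokens
words_per_token = 0.75        # rule of thumb for English text
words_per_page = 500          # rule of thumb for a page of prose

pages = context_tokens * words_per_token / words_per_page
print(f"~{pages:.0f} pages")  # ~197 pages
```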
While writing some code against the gpt-4-1106-preview model via the API, I noticed that long responses never exceed 4,096 completion tokens. Responses cut off mid-sentence or even mid-word, although the total, i.e. input plus completion, is well below 128k tokens. A quick search in the OpenAI forum revealed that others observe the same behaviour: the model does not return more than 4,096 completion tokens.
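The truncation is easy to reproduce and detect. The sketch below, assuming the OpenAI Python SDK v1 and a hypothetical long-form prompt, asks gpt-4-1106-preview for a long completion and inspects the finish reason and token counts in the response:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "user", "content": "Write a 10,000-word essay on data engineering."}
    ],
)

choice = response.choices[0]
usage = response.usage
print(f"prompt tokens:     {usage.prompt_tokens}")
print(f"completion tokens: {usage.completion_tokens}")  # tops out around 4,096

if choice.finish_reason == "length":
    print("The completion was cut off at the token limit.")
```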
The larger context window greatly improves the model's ability to maintain context in lengthy conversations, and RAG applications benefit from more detailed in-context learning and a higher chance of having the relevant text in context. However, the 4,096-token completion cap is a serious limitation for applications that require extensive outputs, for example data generation or conversion. I wish OpenAI had been more proactive in listing this limitation in its DevDay announcement or the API description, although, to be fair, it is mentioned in the model description in the documentation.
Lastly, it is a reminder never to assume but always to check, and to use logs and metrics wherever possible. Biases and issues can creep in from unexpected vectors.
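One cheap way to do that is to log the usage metadata every API response already carries. A minimal sketch, assuming the same OpenAI Python SDK and a hypothetical wrapper function:

```python
import logging

from openai import OpenAI

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm-usage")

client = OpenAI()


def chat(prompt: str, model: str = "gpt-4-1106-preview") -> str:
    """Send a single-turn chat request and log token usage and finish reason."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    choice = response.choices[0]
    usage = response.usage
    logger.info(
        "model=%s prompt_tokens=%d completion_tokens=%d finish_reason=%s",
        model,
        usage.prompt_tokens,
        usage.completion_tokens,
        choice.finish_reason,  # 'length' means the output was truncated
    )
    return choice.message.content
```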
Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.