
Another AI Performance Breakthrough?


New approach shows a reduction in AI processing costs of up to 90%

And it comes not from where you might expect, such as OpenAI or Google, but from Zoom. If the results hold up, the recent paper published by engineers at Zoom indicates that token usage can drop by up to 92%, using an approach they call Chain of Draft (CoD).


Specifically, this new approach is not a code modification but a novel prompting strategy that mirrors human cognitive processes by prioritizing efficiency and minimalism in large language models (LLMs). The reported savings translate to as much as a 90% reduction in AI processing costs.
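To make the idea concrete, here is a minimal sketch of how a CoD-style instruction differs from a standard Chain-of-Thought instruction. The prompt wording is modeled on the style described in the paper, but the exact strings and the `build_messages` helper are illustrative, not the authors' verbatim code:

```python
# Sketch: Chain of Draft (CoD) vs. Chain of Thought (CoT) as system prompts.
# Wording is illustrative, modeled on the approach described in the paper.

COT_SYSTEM = (
    "Think step by step to answer the following question. "
    "Return the answer at the end of the response after a separator ####."
)

COD_SYSTEM = (
    "Think step by step, but only keep a minimum draft for each thinking "
    "step, with 5 words at most. "
    "Return the answer at the end of the response after a separator ####."
)

def build_messages(system_prompt: str, question: str) -> list:
    """Assemble a chat-style request payload usable with any chat LLM API."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]

question = "A jug holds 8 liters. You pour out 3. How many liters remain?"
cod_request = build_messages(COD_SYSTEM, question)
cot_request = build_messages(COT_SYSTEM, question)
```

Because the change lives entirely in the system prompt, it can be dropped into an existing pipeline without retraining or model changes, which is the point the article makes about immediate business impact.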


The implications here are tremendous, on the order of the market impact DeepSeek had around the world in January. Because no retraining is required, the approach could democratize access to sophisticated AI capabilities for smaller organizations and resource-constrained environments, with immediate business impact.


  • CoD encourages LLMs to generate concise, dense-information outputs at each step, reducing latency and computational costs without sacrificing accuracy.

  • Experiments demonstrate that, compared with both standard prompting and the standard Chain of Thought (CoT) approach, CoD significantly reduces latency and token count while maintaining or improving accuracy across various reasoning tasks.

  • Evaluations using CoD show promising results in arithmetic reasoning, commonsense reasoning, and symbolic reasoning tasks, with models such as GPT-4o and Claude 3.5 Sonnet achieving high accuracy with reduced token counts and latency.

  • CoD enables large language models to solve problems with minimal words, using as little as 7.6% of the text required by current methods, while maintaining or improving accuracy.

The research team at Zoom Communications, led by Silei Xu, has made a preprint available on arXiv (arXiv:2502.18600).

