
When calling models whose max_output_token is much smaller than max_prompt_token, the output may sometimes be truncated. If the agent's truncated output ends with a space, it can cause abnormal responses from the model side. #6790


What happened?

When an LLM's max_output_token is much smaller than its max_prompt_token, output from generation-heavy tasks such as coding or writing may be truncated. Truncation can land on a space or a line break, and in that case the truncated message is added to the model_context as-is. One way to handle truncation is to feed the partial output back into the context as a prompt so the LLM can continue generating, but if the truncated message ends with whitespace, the next request fails with an error like this:

openai.BadRequestError: Error code: 400 - {'success': False, 'message': 'Model provider error', 'data': None, 'code': 'MPE-001', 'detailMessage': '{"error": {"type": "llm_call_failed", "message": "{\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"messages: final assistant content cannot end with trailing whitespace\"}}\n", "treace_id": "213e066e17521531374748281e7f22"}}, traceId: 213e066e17521531374748281e7f22'}
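To make the failure mode concrete, here is a minimal sketch of the continue-on-truncation pattern, written against the OpenAI Python SDK rather than the autogen-agentchat internals. The model name, prompt, and max_tokens value are placeholders, and whether a given provider actually continues from a trailing assistant message is provider-dependent; the relevant detail is the rstrip() before resubmitting, which removes the trailing space or newline that otherwise triggers the 400 above.

```python
from openai import OpenAI

client = OpenAI()

base_messages = [
    {"role": "user", "content": "Write a long CSV-parsing function."}  # placeholder prompt
]
partial = ""  # assistant output accumulated across continuation calls

while True:
    messages = list(base_messages)
    if partial:
        # Resume from the partial answer. Strip trailing whitespace first: some
        # providers reject a request whose final assistant message ends with a
        # space or newline ("final assistant content cannot end with trailing
        # whitespace"), which is exactly the 400 shown above.
        messages.append({"role": "assistant", "content": partial.rstrip()})

    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder model name
        messages=messages,
        max_tokens=256,        # deliberately small, so truncation is likely
    )
    choice = resp.choices[0]
    partial += choice.message.content or ""

    if choice.finish_reason != "length":
        break  # the model finished on its own; no further continuation needed

print(partial)
```

Stripping the whitespace does mean the continuation may start glued to the previous word, so an implementation might instead remember the stripped characters and re-insert them locally once the continuation arrives.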

Would you consider the following solution to address this issue?

[Screenshot: proposed solution]

Which packages were the bug in?

Python AgentChat (autogen-agentchat>=0.4.0)

AutoGen library version.

Python 0.6.4

Other library version.

No response

Model used

No response

Model provider

None

Other model provider

No response

Python version

None

.NET version

None

Operating system

None
