Description
What happened?
When an LLM's max_output_token is much smaller than max_prompt_token, the output of generation-heavy tasks such as coding and writing can be truncated. The truncation point may fall on a space or a line break, and the partial message is added to the model_context as-is. One way to handle truncation is to feed that partial output back to the model as part of the prompt so it can continue generating. However, because the partial output can end with trailing whitespace, this approach can fail with errors such as the following:
openai.BadRequestError: Error code: 400 - {'success': False, 'message': '模型提供方错误', 'data': None, 'code': 'MPE-001', 'detailMessage': '{"error": {"type": "llm_call_failed", "message": "{\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"messages: final assistant content cannot end with trailing whitespace\"}}\n", "treace_id": "213e066e17521531374748281e7f22"}}, traceId: 213e066e17521531374748281e7f22'}

(The provider message 模型提供方错误 translates to "model provider error".)
Would you consider the following solution to address this issue?
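Below is a minimal, hypothetical sketch of one possible workaround (not necessarily the exact proposal this issue refers to, and not AutoGen's actual model_context API): strip trailing whitespace from the truncated completion before it is sent back as the final assistant message, since some providers reject a final assistant message that ends with whitespace. The `prepare_continuation` helper and the plain dict message format are illustrative assumptions.

```python
# Hypothetical sketch only: the helper name and plain-dict message format are
# illustrative assumptions, not AutoGen's internal continuation logic.
def prepare_continuation(messages: list[dict], partial_output: str) -> list[dict]:
    """Append a truncated completion as the final assistant message so the
    model can be asked to continue. Trailing whitespace is stripped so that
    providers which enforce "final assistant content cannot end with trailing
    whitespace" accept the request."""
    safe_output = partial_output.rstrip()  # avoid the 400 invalid_request_error
    return messages + [{"role": "assistant", "content": safe_output}]


# Example usage: the truncated output ends with a trailing newline.
history = [{"role": "user", "content": "Write a long Python module."}]
partial = "def main():\n    print('hello')\n"
continued_request = prepare_continuation(history, partial)
# continued_request[-1]["content"] now ends with ')' rather than '\n', so the
# provider no longer rejects the continuation request shown above.
```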

Which package was the bug in?
Python AgentChat (autogen-agentchat>=0.4.0)
AutoGen library version.
Python 0.6.4
Other library version.
No response
Model used
No response
Model provider
None
Other model provider
No response
Python version
None
.NET version
None
Operating system
None