Building Multi-AI Workflows with AutoGen: Sequential Chat and Nested Chat

Posted on  Sep 24, 2024  in  AutoGen  by  Amo Chen  ‐ 8 min read

The previous article, Building Multi-AI Workflows with AutoGen: Two-Agent Chat and Group Chat, introduced two conversation patterns: two-agent chat and group chat. This article covers the remaining two:

  • Sequential Chats
  • Nested Chats

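Understanding AutoGen's conversation patterns makes it easier to handle more complex workflows.

Finally, the Nested Chats code example shows how to run Python code with AutoGen and feed the execution result back into the conversation between agents.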

Environment

$ pip install pyautogen

This article is a sequel to Building Multi-AI Workflows with AutoGen: Two-Agent Chat and Group Chat. If you have not read that article yet, reading it first is recommended.

Sequential Chats

The name of this pattern is self-explanatory – it is a sequence of chats between two agents, chained together by a mechanism called carryover, which brings the summary of the previous chat to the context of the next chat.

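Sequential chat is an ordered conversation pattern. The conversation between each pair of agents (that is, a two-agent chat) is summarized and passed to the next pair of agents as context so that they can continue the work. The mechanism that chains the pairs together is called carryover. The figure below is the flow diagram for sequential chat from the official AutoGen documentation: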

sequential-two-agent-chat.png

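Two points in the figure are worth noting:

  • Carryover accumulates, so later agents also see the carryover from earlier chats (the carryover is placed inside the chat messages). This is why the extra arrows at the bottom of the figure point to agents further along the sequence.
  • Each pair of agents can be given a different message, so each step can be adjusted independently.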
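The accumulation can be pictured with a small plain-Python sketch. It does not use AutoGen: `run_sequence`, the lambda "agents", and the way the prompt string is assembled are hypothetical stand-ins that only imitate how each chat's summary is carried into the chats after it.

```python
# Plain-Python illustration of carryover; not AutoGen's implementation.

def run_sequence(chats):
    """Run single-turn chats in order, carrying every summary forward."""
    carryover = []  # summaries of all finished chats, oldest first
    prompts = []    # what each recipient actually received
    for message, reply_fn in chats:
        prompt = message
        if carryover:
            # Earlier summaries are appended to the message as context.
            prompt += "\nContext:\n" + "\n".join(carryover)
        prompts.append(prompt)
        # Like 'last_msg': the reply itself becomes the summary.
        carryover.append(reply_fn(prompt))
    return carryover, prompts


# Each "agent" is a stub function that returns a fixed reply.
chats = [
    ("Check the grammar.", lambda p: "fix: intresting -> interested"),
    ("Rewrite the email.", lambda p: "draft v1"),
    ("Review the email.", lambda p: "simplify the opening"),
]
summaries, prompts = run_sequence(chats)
print(prompts[2])
```

The third prompt contains the summaries of both earlier chats, which is the behavior the extra arrows in the figure describe.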

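For example, the group chat example in Building Multi-AI Workflows with AutoGen: Two-Agent Chat and Group Chat has an obvious ordering, so it can be rewritten as a sequential chat. In the code below, the manager holds four single-turn chats with the Grammar Checker, Writer, and Reviewer agents in the order Grammar Checker > Writer > Reviewer > Writer, and the final result is a concise, easy-to-understand email: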

sequential_chat_example.py

from autogen import ConversableAgent

llm_config = {
    'config_list': [
        {
            'model': 'llama3.1:8b',
            'base_url': 'http://localhost:11434/v1',
            'api_key': 'ollama',  # any value works
            'price': [0, 0],
        }
    ],
}


manager = ConversableAgent(
    'Manager',
    llm_config=llm_config,
    system_message='You help me rewrite the email enclosed within the triple quotes.',
    human_input_mode='NEVER',
)


grammer_checker = ConversableAgent(
    'Grammar Checker',
    llm_config=llm_config,
    system_message='''
    You are a Grammar Checker.
    Your task is to carefully review the provided English text and identify any grammatical errors.
    For each error, explain what the mistake is and suggest the correct form.
    ''',
)

writer = ConversableAgent(
    'Writer',
    llm_config=llm_config,
    system_message='''
    You are a Writer.
    Your task is to take the provided English text and re-edit it to improve its clarity, flow, and readability while maintaining the original meaning.
    Ensure that the tone remains appropriate for the context.
    ''',
)

reviewer = ConversableAgent(
    'Reviewer',
    llm_config=llm_config,
    system_message='''
    You are a Reviewer.
    Your task is to analyze the given English text and point out any words or phrases that are not simple or clear enough.
    Suggest simpler alternatives that would be easier for a broader audience to understand.
    ''',
)

email = """
Hi,
This is Amo. Nice to e-meet you. Hope you are doing well.
I am writing this email to notify you that I might have a great AI project idea you would like.
If you are intresting, please let me know.
Best,
Amo
"""

manager.initiate_chats([
    {
        'recipient': grammer_checker,
        'message': f'''
	    Please review the email enclosed in the triple quotes.
	    """
        {email}
	    """
	    ''',
        'max_turns': 1,
        'summary_method': 'last_msg',
    },
    {
        'recipient': writer,
        'message': f'''
        Please rewrite the email enclosed in the triple quotes based on the suggestions.
        Email:
        """
        {email}
        """
        Please enclose the rewritten email in triple quotes.
        ''',
        'max_turns': 1,
        'summary_method': 'last_msg',
    },
    {
        'recipient': reviewer,
        'message': '''
        Please review the email enclosed within triple quotes.
        ''',
        'max_turns': 1,
        'summary_method': 'last_msg',
    },
    {
        'recipient': writer,
        'message': '''
        Please rewrite the email enclosed in triple quotes based on the suggestions.
        Please enclose the rewritten email in triple quotes.
        ''',
        'max_turns': 1,
        'summary_method': 'last_msg',
    },
])

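The key point of the code above is ConversableAgent's initiate_chats() method (note the plural: chats), which starts a conversation with several agents in turn. Its argument is a list of dictionaries, and each dictionary holds the arguments of one initiate_chat() call (singular: chat): which agent to talk to, the message/prompt, the condition for ending the chat, the summary method (summary_method), and so on. 'summary_method': 'last_msg' means the last message of a chat is used directly as its summary, which is then passed as context into the next chat.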
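Because every entry is just a dictionary of initiate_chat() arguments, the repeated settings can be factored into a helper. The sketch below is only an illustration: `single_turn_chat` is a hypothetical helper, and placeholder strings stand in for the agent objects so that it runs without AutoGen installed.

```python
def single_turn_chat(recipient, message):
    """Build one chat-queue entry with the settings shared by every step."""
    return {
        'recipient': recipient,
        'message': message,
        'max_turns': 1,                # one request, one reply
        'summary_method': 'last_msg',  # pass the final reply on as carryover
    }


# Placeholder strings stand in for the ConversableAgent objects.
chat_queue = [
    single_turn_chat('grammar_checker', 'Please review the email.'),
    single_turn_chat('writer', 'Please rewrite the email based on the suggestions.'),
]

for chat in chat_queue:
    print(chat['recipient'], chat['max_turns'], chat['summary_method'])
```

In real code the first argument would be the agent itself, and the resulting list would be passed to initiate_chats().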

That completes the sequential chat.

Below is the output of the code above. The final reply is an email rewritten according to the earlier suggestions:

$ python sequential_chat_example.py
************
Starting a new chat....

************
Manager (to Grammar Checker):


	    Please review the email enclosed in the triple quotes.
	    """

Hi,
This is Amo. Nice to e-meet you. Hope you are doing well.
I am writing this email to notify you that I might have a great AI project idea you would like.
If you are intresting, please let me know.
Best,
Amo

	    """

--------------

>>>>>>>> USING AUTO REPLY...
Grammar Checker (to Manager):

Here's the review with suggested corrections:

The email has several grammatical errors. Here they are, along with their corrections:

1. "e-meet" should be "meet".
	* Mistake: Incorrect use of a prefix (in this case, "e-").
	* Suggested correction: "It's nice to meet you."
2. "intresting" should be "interested".
	* Mistake: Confusion between the correct forms of the word "interest" and its incorrect form.
	* Suggested correction: "If you are interested..."
3. The entire email is an informal letter, but considering the fact that Amo is introducing himself to someone he is contacting for the first time (implied by the phrase 'I might have a great AI project idea you would like'), it would be more professional if Amo made it clear why he was reaching out.

--------------

...(middle part omitted for length)...

--------------

>>>>>>>> USING AUTO REPLY...
Writer (to Manager):

Here is the rewritten email enclosed in triple quotes:

"""
Hi there,

It's nice to meet you. Hope you're doing well.

I'd love to share some AI project ideas with you, if you're interested.

If so, please let me know and we can discuss further. I'm excited that my concept might align with your interests.

Best regards,
Amo"""

I have addressed the points listed below:

* Simplified the opening by removing unnecessary words.
* Clarified the invitation to share ideas by rephrasing it for a broader audience.
* Removed the second transition sentence and simplified the one that remained to clearly communicate your intention.
* Replaced "collaborator" with "align with your interests."

--------------

Nested Chats

So far we have covered three conversation patterns:

  • Two-agent chats
  • Group chats
  • Sequential chats

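None of these three patterns enforces a single point of contact; in other words, we can start a conversation with any agent directly.

This section introduces the last pattern, nested chats. This pattern does have a single point of contact: a workflow can be packaged into one agent, which then serves as the sole entry point, so the workflow can be reused.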

The previous conversations patterns (two-agent chat, sequential chat, and group chat) are useful for building complex workflows, however, they do not expose a single conversational interface, which is often needed for scenarios like question-answering bots and personal assistants. In some other cases, it is also useful to package a workflow into a single agent for reuse in a larger workflow. AutoGen provides a way to achieve this by using nested chats.

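The figure below is AutoGen's example diagram for nested chat. It can look complicated and confusing to beginners, but with a short explanation the way nested chat works becomes clear: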

nested-chat.png

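Agent A on the left of the figure acts as the point of contact for the Nested Chats on the right. We cannot interact with the Nested Chats directly; we have to start a conversation with Agent A. The Trigger inside Agent A decides whether the Nested Chats should be used. If so, the flow enters the Nested Chats on the right, their result is returned to Agent A, and Agent A is responsible for replying.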
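The control flow described above can be sketched in plain Python. This is a toy model under stated assumptions, not AutoGen code: `FrontAgent`, `nested_workflow`, and the sender names are all hypothetical and only mirror the "trigger, then hidden workflow, then reply" sequence.

```python
class FrontAgent:
    """Toy stand-in for Agent A: a single entry point that may delegate."""

    def __init__(self, trigger, nested_workflow, default_reply):
        self.trigger = trigger                  # decides whether to use the nested chats
        self.nested_workflow = nested_workflow  # hidden steps run behind the agent
        self.default_reply = default_reply

    def reply(self, sender, message):
        if self.trigger(sender):
            # Run the hidden workflow and answer with its result.
            return self.nested_workflow(message)
        return self.default_reply


def nested_workflow(message):
    # Two chained steps whose intermediate results the caller never sees.
    cleaned = message.strip()
    return cleaned.upper()


agent_a = FrontAgent(
    trigger=lambda sender: sender == 'user',
    nested_workflow=nested_workflow,
    default_reply='(no nested chat)',
)
print(agent_a.reply('user', '  hello  '))   # HELLO
print(agent_a.reply('other', '  hello  '))  # (no nested chat)
```

The caller only ever talks to `agent_a`; whether a hidden workflow ran is invisible from the outside, which is what makes the packaged workflow reusable.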

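Fixing the Number-Guessing Game with Nested Chats

Next, we rewrite the number-guessing example from Building Multi-AI Workflows with AutoGen: Two-Agent Chat and Group Chat to use a nested chat. The nested chat contains a single agent that can run Python code through DockerCommandLineCodeExecutor. We compare the numbers with a Python program, which works around Llama 3.1 8B's weak ability to compare numbers.

The interaction flow between the agents in the revised example is shown below: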

nested-chat-example.png

The complete code:

nested_chat_example.py

import re
import tempfile
from textwrap import dedent

from autogen import ConversableAgent
from autogen.coding import DockerCommandLineCodeExecutor

temp_dir = tempfile.gettempdir()

llm_config = {
    'config_list': [
        {
            'model': 'llama3.1:8b',
            'base_url': 'http://localhost:11434/v1',
            'api_key': 'ollama',  # any value works
            'price': [0, 0],
        }
    ],
}

jack = ConversableAgent(
    'Jack',
    llm_config=llm_config,
    system_message='''
    You are playing a game called "Guess My Number."
    ''',
    is_termination_msg=lambda msg: '53' in msg['content'],
)

cathy = ConversableAgent(
    'Cathy',
    llm_config=llm_config,
    system_message='''
    I have a number in mind, and you will try to guess it.
    If I respond with "Too high", you should guess a lower number.
    If I respond with "Too low", you should guess a higher number.
    Please only answer the number which you guess no additional description is needed.
    ''',
)

executor = DockerCommandLineCodeExecutor(
    image="python:3.12-slim",  # Execute code using the given docker image name.
    timeout=10,  # Timeout for each code execution in seconds.
    work_dir=temp_dir,  # Use the temporary directory to store the code files.
)

code_executor_agent_using_docker = ConversableAgent(
    "code_executor_agent_docker",
    llm_config=False,  # Turn off LLM for this agent.
    code_execution_config={"executor": executor},  # Use the docker command line code executor.
    human_input_mode="NEVER",  # NEVER keeps the demo non-interactive; prefer "ALWAYS" for safety.
)

def generate_prompt_for_code_executor(sender, messages, recipient, context):
    n = messages[-1]['content']
    return dedent(f"""
    Execute the following script to verify {n} is bigger than 53.

    ```python
    print("Too High" if int("{n}") > 53 else "Too Low")
    ```
    """)

output_re = re.compile(r'Code output:\s([a-zA-Z ]*)\n', re.M)

def summarize(sender, recipient, summary_args):
    last_msg = recipient.last_message(sender)["content"]
    match = output_re.search(last_msg)
    if match:
        return match[1]
    return last_msg

nested_chats = [
    {
        "recipient": code_executor_agent_using_docker,
        "message": generate_prompt_for_code_executor,
        "summary_method": summarize,
        "max_turns": 1,
    },
]

jack.register_nested_chats(
    nested_chats,
    trigger=lambda sender: sender is cathy
)

result = jack.initiate_chat(cathy, message='I have a number between 1 and 100. Guess it!')

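The key parts of the code are explained below.

First, we define two agents:

  • Jack, who tells Cathy whether the guess is correct. If it is not, Jack gives Cathy a hint; if it is, the run stops.
  • Cathy, who guesses the number.

Next, we use DockerCommandLineCodeExecutor, AutoGen's built-in class for executing code. It runs the code we specify (or code found in chat messages) inside a Docker container and returns the output. We use the python:3.12-slim image and set the work_dir parameter as the place to store temporary code files: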

executor = DockerCommandLineCodeExecutor(
    image="python:3.12-slim",  # Execute code using the given docker image name.
    timeout=10,  # Timeout for each code execution in seconds.
    work_dir=temp_dir,  # Use the temporary directory to store the code files.
)

Then we create an agent that is able to execute code, which is what the code_execution_config parameter below provides:

code_executor_agent_using_docker = ConversableAgent(
    "code_executor_agent_docker",
    llm_config=False,  # Turn off LLM for this agent.
    code_execution_config={"executor": executor},  # Use the docker command line code executor.
    human_input_mode="NEVER",  # NEVER keeps the demo non-interactive; prefer "ALWAYS" for safety.
)

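Note the human_input_mode="NEVER" setting in the code above. It is set to NEVER here only to keep the demo running smoothly.

For safety, human_input_mode="ALWAYS" is recommended. The underlying reason is that the code the executor is about to run cannot be trusted, so a human should confirm it before it is executed.

Next, we set up the nested chat flow. Only code_executor_agent_using_docker needs to be configured here, with the following parameters:

  • message: the code that code_executor_agent_using_docker runs must include the number Cathy guessed, so a callable, generate_prompt_for_code_executor(), is used to read Cathy's message and generate the code dynamically. Inside that function, n = messages[-1]['content'] retrieves Cathy's most recent message.

  • max_turns: a single code execution is enough to tell whether the number is too high or too low, so max_turns is set to 1.

  • summary_method: Jack returns the result from code_executor_agent_using_docker as his reply, and we want that reply to be just Too Low or Too High. A callable, summarize(), therefore takes the executor's result and uses a regular expression to extract the needed string. Without the callable, the returned message would be a string such as exitcode: 0 (execution succeeded)\nCode output: Too High\n, which could affect Cathy's replies if left unmodified.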

nested_chats = [
    {
        "recipient": code_executor_agent_using_docker,
        "message": generate_prompt_for_code_executor,
        "summary_method": summarize,
        "max_turns": 1,
    },
]
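The two callables can be tried out on stand-in data without Docker or an LLM. The sketch below is only an illustration: it reuses the regular expression from nested_chat_example.py, while `extract_hint`, the message dictionaries (including their `role` values), and the sample executor reply are assumptions made for the example.

```python
import re

# Same pattern as in nested_chat_example.py.
output_re = re.compile(r'Code output:\s([a-zA-Z ]*)\n', re.M)


def extract_hint(executor_reply):
    """Return only the hint from a code executor reply, or the reply unchanged."""
    match = output_re.search(executor_reply)
    return match[1] if match else executor_reply


# Stand-in for the conversation history passed to the message callable.
messages = [
    {'role': 'user', 'content': 'I have a number between 1 and 100. Guess it!'},
    {'role': 'assistant', 'content': '54'},
]
latest_guess = messages[-1]['content']  # the most recent message is Cathy's guess

# Stand-in for the reply produced by the code executor agent.
executor_reply = 'exitcode: 0 (execution succeeded)\nCode output: Too High\n'

print(latest_guess)                  # 54
print(extract_hint(executor_reply))  # Too High
print(extract_hint('no code ran'))   # no code ran
```

Falling back to the unmodified reply when the pattern does not match mirrors what summarize() does, so an execution error would still be visible instead of being silently dropped.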

Then we register the nested chat on Jack and set the trigger so that the nested chat is used whenever the sender is Cathy:

jack.register_nested_chats(
    nested_chats,
    trigger=lambda sender: sender is cathy
)

Finally, we start the conversation with Cathy to begin the number-guessing game:

result = jack.initiate_chat(cathy, message='I have a number between 1 and 100. Guess it!')

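With this in place, Jack can judge whether a number is higher or lower correctly.

Part of the output is shown below. It shows the code executor at work and the interaction between the agents: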

>>>>>>>> USING AUTO REPLY...
Cathy (to Jack):

54

---------------

>>>>>>>> USING AUTO REPLY...

***************
Starting a new chat....

***************
Jack (to code_executor_agent_docker):


Execute the following script to verify 54 is bigger than 53.
"""python
print("Too High" if int("54") > 53 else "Too Low")
"""

---------------

>>>>>>>> EXECUTING CODE BLOCK (inferred language is python)...
code_executor_agent_docker (to Jack):

exitcode: 0 (execution succeeded)
Code output: Too High


---------------
Jack (to Cathy):

Too High

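P.S. The output above contains the code print("Too High" if int("54") > 53 else "Too Low"), where 54 was produced by Cathy. If that value were typed in by a human instead, someone who guessed roughly how the code works could inject malicious code and cause harm, so any step that executes code like this must be designed with security in mind.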
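One possible mitigation, sketched here under stated assumptions, is to validate the guess before it is interpolated into any script. `parse_guess` and `build_check_script` are hypothetical helpers that are not part of the original example; the range 1 to 100 and the target 53 are taken from the game above.

```python
import re


def parse_guess(text, low=1, high=100):
    """Return the guess as an int, or raise ValueError for anything else."""
    cleaned = text.strip()
    # Accept only a bare run of ASCII digits: no quotes, operators, or code.
    if not re.fullmatch(r'[0-9]{1,3}', cleaned):
        raise ValueError(f'not a plain number: {text!r}')
    guess = int(cleaned)
    if not low <= guess <= high:
        raise ValueError(f'out of range: {guess}')
    return guess


def build_check_script(text, target=53):
    """Build the comparison script from a validated integer only."""
    guess = parse_guess(text)
    return f'print("Too High" if {guess} > {target} else "Too Low")'


print(build_check_script('54'))
try:
    build_check_script('54"); import os; os.system("echo pwned')
except ValueError as exc:
    print('rejected:', exc)
```

Only the validated integer ever reaches the generated script, so text that tries to close the string literal and append extra statements is rejected before any code is built. Validation like this complements, rather than replaces, running the code in a container and asking a human to confirm.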

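That covers what nested chats do.

Nested chats let us hide logic and process steps behind a single agent, which makes it possible to design more flexible or more complex agentic workflows.

Hiding the Output of code_executor_agent_docker

The output above includes the code executor's result. To hide an agent's output, set the silent parameter to True, for example: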

nested_chats = [
    {
        "recipient": code_executor_agent_using_docker,
        "message": generate_prompt_for_code_executor,
        "summary_method": summarize,
        "silent": True,
        "max_turns": 1,
    },
]

With this setting, the output of code_executor_agent_using_docker no longer appears in the printed conversation log.

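Other Code Executors

Besides DockerCommandLineCodeExecutor, AutoGen also ships with the following code executors (see the official documentation for the full list):

  • LocalCommandLineCodeExecutor, which executes code in the local host environment. It carries a higher risk, so unless there is a compelling reason, DockerCommandLineCodeExecutor is recommended instead.
  • JupyterCodeExecutor, which executes code on a specified Jupyter server. This feature is still experimental.

Code executors are, in my view, one of AutoGen's most useful and practical features, and they fill a gap that LangChain has in this area.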

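Summary

By now we have covered AutoGen's most important conversation patterns, and you should be reasonably equipped to implement multi-agent collaboration with AutoGen. With well-designed prompts and a well-designed conversation pattern (collaboration flow) between agents, it becomes possible to accomplish many tasks that zero-shot prompting alone cannot handle.

For reasons of length, more AutoGen usage will be covered in the next article.

That's all!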

Enjoy!

References

AutoGen | AutoGen
