How to Combine Command Line Tools with Python

Posted on Jul 26, 2024 in Python Programming - Intermediate by Amo Chen ‐ 5 min read


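Because so many handy programs and command line tools are available today, we do not always need to build everything from scratch. Python's nature as a glue language lets us integrate existing programs and command line tools into something that fits our needs, saving both time and effort. In practice, Python is also commonly used to write programs that tie command line tools together for automation and system administration work.

This article introduces ways to integrate programs and command line tools with Python.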

Environment

Python 3
macOS

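Glue Language

A glue language is a programming language well suited to integrating or connecting multiple programs in order to provide a service or feature. Perl and Python are among the better-known glue languages.

Glue languages are generally easy to use, which lets developers quickly combine existing programs and command line tools into the tool or prototype they need.

Compared with production-grade code, tools built with a glue language aim to reach the goal quickly by leveraging what already exists rather than to maximize robustness and stability. That makes them a good fit for programs used in academic research, lightweight data processing pipelines, and similar applications.

The rest of this article introduces several ways to integrate programs and command line tools with Python.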

Pipeline

A pipeline is a common way to combine commands on Unix-like systems. For example, piping the output of ls -alh into the pager less lets you browse the listing one page at a time:

$ ls -alh | less

It works by feeding the output (stdout) of ls -alh into the stdin of less.

To do the same in Python, just read from sys.stdin, as in the following example:

read_stdin.py

import sys

if __name__ == '__main__':
    for line in sys.stdin:
        print(f"Processing line: {line.strip()}")

As before, we combine ls -alh with a pipeline (the | symbol) so that the Python script above can read the output (stdout) of ls -alh:

$ ls -alh | python read_stdin.py

The command above prints output similar to the following:

Processing line: total 8
Processing line: drwxr-xr-x   3 abc  staff    96B  7 26 14:22 .
Processing line: drwxr-xr-x  49 abc  staff   1.5K  7 26 14:21 ..
Processing line: -rw-r--r--   1 abc  staff   116B  7 26 14:22 read_stdin.py
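
If the processing logic is pulled into a function that accepts any iterable of lines, the same code works with sys.stdin and can be tried out without a real pipe. The following is only a minimal sketch; the function name process_lines is hypothetical and not part of the original script, and io.StringIO is used here as a stand-in for sys.stdin.

```python
import io


def process_lines(lines):
    """Yield one formatted string per input line (hypothetical helper)."""
    for line in lines:
        # rstrip("\n") removes only the trailing newline, so the column
        # alignment at the start of each line is preserved.
        text = line.rstrip("\n")
        yield f"Processing line: {text}"


# io.StringIO stands in for sys.stdin so the sketch runs without a pipe.
fake_stdin = io.StringIO("total 8\nread_stdin.py\n")
demo_output = list(process_lines(fake_stdin))
for out in demo_output:
    print(out)
```

In a real script, passing sys.stdin to process_lines would give the same behavior as read_stdin.py when it is placed at the end of a pipeline.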

Some may find it tedious to type python every time. We can add #!/usr/bin/env python as the first line of the program, so that it becomes:

#!/usr/bin/env python

import sys

if __name__ == '__main__':
    for line in sys.stdin:
        print(f"Processing line: {line.strip()}")

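p.s. The #! sequence is called a Shebang or Hashbang.

#!/usr/bin/env python means that when this file has execute permission and is run directly, the system first looks up the python interpreter through the PATH environment variable and then uses it to run the file. /usr/bin/env is used because the path to python differs from system to system, so it is better to let the env command find it.

If you know the exact path of the python command, you can put that path after #! instead, for example: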

#!/usr/bin/python3

Next, we have to make sure the file is executable. The following command adds execute permission:

$ chmod +x read_stdin.py

Finally, we can use a shorter command:

$ ls -alh | ./read_stdin.py

I/O Redirection

Besides pipelines, there is another mechanism called I/O redirection, which can be used in 3 ways.

Read input from a file:

$ command < input_file

Write stdout to a file:

$ command > output_file

Or both at once:

$ command < input_file > output_file

With I/O redirection, input can likewise be read from sys.stdin. First, use the following command to write the listing to a file:

$ ls -alh > files.txt

Then use the following command to check whether read_stdin.py can read the contents of files.txt:

$ ./read_stdin.py < files.txt

The output is shown below; the file contents are read from sys.stdin:

Processing line: total 8
Processing line: drwxr-xr-x   3 abc  staff    96B  7 26 14:22 .
Processing line: drwxr-xr-x  49 abc  staff   1.5K  7 26 14:21 ..
Processing line: -rw-r--r--   1 abc  staff   116B  7 26 14:22 read_stdin.py
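
Because > redirects only stdout, a script that is meant to be combined with other tools usually writes its results to stdout and its diagnostics to stderr. The sketch below illustrates that split under some assumptions: the helper name filter_lines and its "skip blank lines" behavior are hypothetical, and in-memory streams stand in for the real sys.stdin, sys.stdout, and sys.stderr.

```python
import io


def filter_lines(in_stream, out_stream, err_stream):
    """Copy non-blank lines to out_stream; report the skipped count on err_stream."""
    skipped = 0
    for line in in_stream:
        if line.strip():
            out_stream.write(line)
        else:
            skipped += 1
    # Diagnostics go to a separate stream so they never mix with the results.
    print(f"skipped {skipped} blank line(s)", file=err_stream)
    return skipped


# In-memory streams stand in for sys.stdin, sys.stdout and sys.stderr.
source = io.StringIO("a\n\nb\n")
results, diagnostics = io.StringIO(), io.StringIO()
filter_lines(source, results, diagnostics)
print(results.getvalue(), end="")
```

Called as filter_lines(sys.stdin, sys.stdout, sys.stderr) in a real script, a command of the form command < input_file > output_file would store only the filtered lines in the output file, while the diagnostic message would still appear in the terminal.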

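Using the subprocess Module

Now that we know how commands are combined, we can use the subprocess module to run a command line tool directly from Python and read its output. Taking ls -alh | ./read_stdin.py as the example again, moving everything into Python code gives: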

import subprocess

# Run a shell command and capture its output
result = subprocess.run(
    ['ls', '-alh'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)

if result.returncode == 0:
    # Output the results
    print("Standard Output:")
    for line in result.stdout.split('\n'):
        print(line)
else:
    print("Standard Error:")
    print(result.stderr)

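The key point in the code above is that subprocess.run() executes the command ls -alh. The Python documentation recommends passing the command as a sequence, which helps prevent command injection. Because no shell parses a sequence, each element reaches the program as a single literal argument. The following simulates a malicious cat /etc/passwd being slipped into the command; the whole string is handed to ls as one argument and cat is never run: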

result = subprocess.run(
    ['ls', '-alh; cat /etc/passwd'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)
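
To see what the child process actually receives, the following sketch replaces ls with a small Python one-liner that prints its own arguments. This is an illustration rather than part of the original example: the variable user_input is hypothetical, and sys.executable is used only so the sketch does not depend on any particular command being installed.

```python
import subprocess
import sys

# Hypothetical untrusted input that tries to smuggle in a second command.
user_input = "-alh; cat /etc/passwd"

# The child process simply prints the arguments it received.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1:])", user_input],
    stdout=subprocess.PIPE,
    text=True,
)

# The whole string arrives as one list element; nothing is split on ";".
received = result.stdout.strip()
print(received)
```

The printed list contains exactly one element, which shows that the semicolon was never interpreted as a command separator.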

The parameters stdout=subprocess.PIPE and stderr=subprocess.PIPE send the stdout and stderr of ls -alh to pipes so that Python can capture them. text=True opens stdin, stdout, and stderr in text mode, which you can think of as converting the data to Python strings. As a result, the output of ls -alh is available through the stdout attribute of result:

for line in result.stdout.split('\n'):
    print(line)

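Without text=True, the code above has to be changed to the following, because the bytes output must first be decoded into a Python string before it is processed:

for line in result.stdout.decode().split('\n'):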
    print(line)
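
The difference between the two modes can be checked directly. The sketch below also uses capture_output=True, which is available from Python 3.7 and is shorthand for passing stdout=subprocess.PIPE and stderr=subprocess.PIPE. The child command is an assumption chosen for portability; any command that prints a line would do.

```python
import subprocess
import sys

# A portable child command that prints one line.
cmd = [sys.executable, "-c", "print('hello')"]

# capture_output=True is shorthand for stdout=PIPE and stderr=PIPE.
as_bytes = subprocess.run(cmd, capture_output=True)
as_text = subprocess.run(cmd, capture_output=True, text=True)

# Without text=True the captured output is bytes; with it, str.
print(type(as_bytes.stdout).__name__)
print(type(as_text.stdout).__name__)
```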

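result is an instance of the CompletedProcess class, which carries attributes such as returncode, stdout, and stderr, along with a few methods.

returncode is the integer that every command reports when it finishes to indicate whether it succeeded; 0 means success with no error. This is not specific to Python but a mechanism called exit status. The shell stores the result in a special shell parameter, $?, which tells us whether the previous command succeeded. Try the following 2 commands in a shell: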

$ ls -alh > /dev/null
$ echo $?
0

$? is 0 in the output above, which means ls -alh > /dev/null succeeded.

Now try a command that is certain to fail:

$ not_a_command
...(略)...
$ echo $?
127

The exit status here is 127, which means the command was not found. An exit status of 126 means the command was found but is not executable.
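
The same exit status can be inspected from Python. The following is a minimal sketch under a few assumptions: the child command is a Python one-liner that deliberately exits with status 3, and not_a_command is assumed not to exist on PATH. Note that the value 127 is produced by the shell; when subprocess.run() is given a sequence and no shell is involved, a missing program raises FileNotFoundError instead.

```python
import subprocess
import sys

# A child process that deliberately exits with status 3.
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

result = subprocess.run(failing_cmd)
print(result.returncode)

# With check=True, a non-zero exit status raises CalledProcessError.
try:
    subprocess.run(failing_cmd, check=True)
    raised_code = None
except subprocess.CalledProcessError as exc:
    raised_code = exc.returncode

# Assumption: no program named not_a_command exists on PATH.
# Without a shell there is no exit status 127; Python raises an exception.
try:
    subprocess.run(["not_a_command"])
    missing = False
except FileNotFoundError:
    missing = True

print(raised_code, missing)
```

check=True is convenient when a failed command should stop the script immediately instead of being handled through an if on returncode.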

Writing stdout / stderr Directly to Files

The earlier examples use stdout=subprocess.PIPE to send stdout to a pipe. To write stdout and stderr straight to files instead, pass Python file objects:

import subprocess

with open('ls_out', 'wb') as out, open('ls_err', 'wb') as err:
    result = subprocess.run(
        ['ls', '-alh'],
        stdout=out,
        stderr=err,
    )
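
The stdin parameter accepts a file object in the same way, so the shell form command < input_file > output_file can be reproduced as well. The following is a sketch rather than a definitive recipe: the helper name run_with_redirection is hypothetical, and the demo uses a Python one-liner that upper-cases its input plus temporary files so it can run anywhere.

```python
import os
import subprocess
import sys
import tempfile


def run_with_redirection(cmd, input_path, output_path):
    """Run cmd with stdin read from input_path and stdout written to output_path."""
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        return subprocess.run(cmd, stdin=src, stdout=dst).returncode


# Demo child process: upper-cases whatever it reads from stdin.
upper_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "input_file")
    out_path = os.path.join(tmp, "output_file")
    with open(in_path, "w") as f:
        f.write("hello\n")
    exit_status = run_with_redirection(upper_cmd, in_path, out_path)
    with open(out_path) as f:
        redirected = f.read()

print(exit_status, redirected, end="")
```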

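Combining Multiple Commands

To combine multiple commands, use multiple subprocess.run() calls, each running 1 command, and feed the output of the previous command in as the input of the next. For example, the following command extracts the file name column from ls -alh: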

$ ls -alh | awk '{print $9}'

Chained together with the Python subprocess module, it becomes:

import subprocess

ls_result = subprocess.run(
    ['ls', '-alh'],
    stdout=subprocess.PIPE,
    text=True
)

awk_result = subprocess.run(
    ['awk', '{print $9}'],
    input=ls_result.stdout,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(awk_result.stdout)

The input=ls_result.stdout part of the code above is what turns the output of ls -alh into the input of awk '{print $9}'.

That is how multiple commands are combined.
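
Chaining subprocess.run() calls keeps the entire output of the first command in memory before the second command starts. An alternative that behaves more like a shell pipe is to connect two subprocess.Popen objects so data streams between them. The sketch below follows that pattern; the helper name run_pipeline is hypothetical, and the demo commands are Python one-liners chosen so the example does not depend on ls or awk being installed.

```python
import subprocess
import sys


def run_pipeline(first_cmd, second_cmd):
    """Connect first_cmd's stdout to second_cmd's stdin and return the final output."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(
        second_cmd, stdin=first.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy so first_cmd receives SIGPIPE if second_cmd exits early.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    return output


# For the article's example: run_pipeline(['ls', '-alh'], ['awk', '{print $9}'])
producer = [sys.executable, "-c", "print('a b'); print('c d')"]
consumer = [
    sys.executable,
    "-c",
    "import sys\nfor l in sys.stdin: print(l.split()[1])",
]
piped = run_pipeline(producer, consumer)
print(piped, end="")
```

For small outputs the subprocess.run() approach is simpler; the Popen form is mainly worth considering when the first command produces a lot of data.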

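By now you should be able to use Python as a glue language to build tools that integrate all kinds of command line tools!

Summary

Integrating command line tools with Python is convenient and flexible. The subprocess module in particular lets developers run and manage system tasks and other work effectively. (This approach is best kept to personal or team use; it is not suited to production-level applications and services that require robustness and stability.)

That's all!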

Enjoy!

References

subprocess — Subprocess management

Exit Status
