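Python Generators

A generator is a special kind of iterator that uses the yield keyword to produce values one at a time instead of returning them all at once. Because each value is computed only when it is requested, generators are very memory-efficient.

Basic concepts

A regular function is finished once it returns a value with return. A generator function instead uses yield to pause and hand back a value, then resumes from where it paused the next time a value is requested: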

def simple_generator():
    yield 1
    yield 2
    yield 3

gen = simple_generator()
print(next(gen))  # 1
print(next(gen))  # 2
print(next(gen))  # 3
# print(next(gen))  # would raise StopIteration: the generator is exhausted
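
Calling next() on an exhausted generator raises StopIteration. The following is a minimal sketch of the usual ways to deal with that; simple_generator is repeated so the block runs on its own, and the "done" sentinel is an arbitrary value chosen for illustration:

```python
def simple_generator():
    yield 1
    yield 2
    yield 3

gen = simple_generator()

# A for loop handles StopIteration internally and simply stops.
collected = []
for value in gen:
    collected.append(value)
print(collected)  # [1, 2, 3]

# The generator is now exhausted. next() with a default returns the
# default instead of raising StopIteration.
leftover = next(gen, "done")
print(leftover)  # done

# Without a default, the exception has to be caught explicitly.
try:
    next(gen)
    raised = False
except StopIteration:
    raised = True
print(raised)  # True
```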

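Generator functions

Any function whose body contains yield is a generator function. Calling it does not run the body; it returns a generator object that runs the body step by step as values are requested: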

def count_up_to(n):
    i = 1
    while i <= n:
        yield i
        i += 1

# Iterate with a for loop
for num in count_up_to(5):
    print(num)

Output:

1
2
3
4
5
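
To make the distinction between a generator function and the generator object it returns more concrete, here is a small sketch using the standard inspect module. count_up_to is repeated so the block is self-contained; the variable names are illustrative only:

```python
import inspect

def count_up_to(n):
    i = 1
    while i <= n:
        yield i
        i += 1

counter = count_up_to(3)  # no code in the body has run yet

is_gen_function = inspect.isgeneratorfunction(count_up_to)
is_gen_object = inspect.isgenerator(counter)
is_own_iterator = iter(counter) is counter  # a generator is its own iterator

first_pass = list(counter)   # consumes every value
second_pass = list(counter)  # nothing left: a generator is single-use

print(is_gen_function, is_gen_object, is_own_iterator)  # True True True
print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
```

To iterate again, call count_up_to(3) a second time to get a fresh generator.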

Generators vs. lists

Memory comparison

import sys

# List: builds every element up front
numbers_list = [x ** 2 for x in range(1000000)]
print(f"List size: {sys.getsizeof(numbers_list)} bytes")

# Generator: produces elements only when they are needed
numbers_gen = (x ** 2 for x in range(1000000))
print(f"Generator size: {sys.getsizeof(numbers_gen)} bytes")

Output (exact sizes vary with the Python version and platform):

List size: 8448728 bytes
Generator size: 200 bytes

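The generator object needs only a small, fixed amount of memory because it never holds all of its values at once. Note that sys.getsizeof reports only the list's own array of references, not the integer objects it points to, so the list's real footprint is larger still.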
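
Because sys.getsizeof is a shallow measurement, a rough alternative is to track peak allocations with the standard tracemalloc module. This is only a sketch: peak_bytes is a hypothetical helper, N is an arbitrary size, and the absolute numbers depend on the interpreter, so only the comparison between the two is meaningful:

```python
import tracemalloc

def peak_bytes(work):
    """Hypothetical helper: peak traced allocation while work() runs."""
    tracemalloc.start()
    work()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

N = 100_000

# The list comprehension materializes all N squares before summing.
list_peak = peak_bytes(lambda: sum([x ** 2 for x in range(N)]))

# The generator expression feeds sum() one square at a time.
gen_peak = peak_bytes(lambda: sum(x ** 2 for x in range(N)))

print(f"list peak:      {list_peak} bytes")
print(f"generator peak: {gen_peak} bytes")
print(list_peak > gen_peak)
```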

Performance comparison

# When only some of the elements are needed, a generator is more efficient
def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            return n
    return None

# The list comprehension builds all 1,000,000 elements first
result = find_first_even([x for x in range(1000000)])

# The generator produces only as many elements as are actually consumed
result = find_first_even(x for x in range(1000000))
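
One way to see this laziness is to count how many elements are actually produced before the search stops. In the sketch below, counted and the stats dictionary are hypothetical helpers added for illustration; find_first_even is repeated from above:

```python
def counted(iterable, stats):
    """Yield items from iterable, counting how many were produced."""
    for item in iterable:
        stats["produced"] += 1
        yield item

def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            return n
    return None

stats = {"produced": 0}

# Start at 1 so the first even number is the second element.
result = find_first_even(counted(range(1, 1_000_000), stats))

print(result)             # 2
print(stats["produced"])  # 2
```

Only two elements are ever generated; the remaining values are never computed.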

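Generator expressions

A generator expression looks like a list comprehension but is written with parentheses instead of square brackets: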

# List comprehension
squares_list = [x ** 2 for x in range(5)]

# Generator expression
squares_gen = (x ** 2 for x in range(5))

print(list(squares_gen))  # [0, 1, 4, 9, 16]
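
Generator expressions are often passed straight to functions that consume an iterable. As a brief sketch (the words list is made-up sample data): when the generator expression is the only argument, the call's own parentheses are enough, but it needs its own pair when other arguments are present.

```python
words = ["generator", "yield", "iterator"]

# Sole argument: no extra parentheses needed.
total_length = sum(len(w) for w in words)
has_long_word = any(len(w) > 8 for w in words)

# With another argument (key=...), the expression needs its own parentheses.
longest = max((w for w in words), key=len)

print(total_length)   # 22
print(has_long_word)  # True
print(longest)        # generator
```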

How yield works

def demo_generator():
    print("Start")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")
    yield 3
    print("End")

gen = demo_generator()

print("Getting first value...")
print(next(gen))

print("\nGetting second value...")
print(next(gen))

print("\nGetting third value...")
print(next(gen))

Output:

Getting first value...
Start
1

Getting second value...
After first yield
2

Getting third value...
After second yield
3
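
Note that "End" never appears in the output above, because the code after the last yield runs only when a fourth value is requested. A short sketch of that final step, with demo_generator repeated so the block is self-contained:

```python
def demo_generator():
    print("Start")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")
    yield 3
    print("End")

gen = demo_generator()
values = [next(gen), next(gen), next(gen)]

try:
    # Resumes after "yield 3", prints "End", then the body finishes,
    # which surfaces to the caller as StopIteration.
    next(gen)
    finished = False
except StopIteration:
    finished = True

print(values, finished)  # [1, 2, 3] True
```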

yield from

yield from delegates to another iterable, such as another generator, and yields each of its values in turn:

def generator1():
    yield 1
    yield 2

def generator2():
    yield 'a'
    yield 'b'

def combined():
    yield from generator1()
    yield from generator2()

for item in combined():
    print(item)

Output:

1
2
a
b
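
yield from is especially handy for recursive generators, where each level delegates to the next. The flatten function below is a hypothetical example, not part of the original text, and it assumes the nesting consists only of lists:

```python
def flatten(nested):
    """Recursively yield the leaves of arbitrarily nested lists."""
    for item in nested:
        if isinstance(item, list):
            # Delegate to a sub-generator for the nested list.
            yield from flatten(item)
        else:
            yield item

flat = list(flatten([1, [2, [3, 4]], [5], 6]))
print(flat)  # [1, 2, 3, 4, 5, 6]
```

Without yield from, the delegation line would need an explicit inner loop: for x in flatten(item): yield x.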

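Two-way communication

A generator can also receive values from its caller: the argument passed to send() becomes the result of the paused yield expression inside the generator: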

def echo_generator():
    while True:
        received = yield
        print(f"Received: {received}")

gen = echo_generator()
next(gen)  # Prime the generator: run it up to the first yield

gen.send("Hello")  # Received: Hello
gen.send("World")  # Received: World
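
The priming step matters: sending a non-None value to a generator that has not yet reached its first yield raises TypeError, because there is no paused yield expression to receive it. A minimal sketch, with echo_generator repeated from above:

```python
def echo_generator():
    while True:
        received = yield
        print(f"Received: {received}")

gen = echo_generator()

try:
    gen.send("Hello")  # not primed yet
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError

# The failed send() did not start the generator, so it can still be primed.
# send(None) is equivalent to next(gen).
gen.send(None)
result = gen.send("Hello")  # prints "Received: Hello"
print(result)               # None, because the bare yield produces None
```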

Practical example: running average

def running_average():
    total = 0
    count = 0
    average = None
    
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)  # Prime the generator

print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0
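
Since forgetting the priming call is an easy mistake, one common pattern is to wrap the generator function in a decorator that primes it automatically. The primed decorator below is a hypothetical helper sketched for illustration, not a standard-library function:

```python
import functools

def primed(func):
    """Hypothetical decorator: advance a new generator to its first yield."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # run up to the first yield so send() works right away
        return gen
    return wrapper

@primed
def running_average():
    total = 0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

avg = running_average()  # already primed
results = [avg.send(10), avg.send(20), avg.send(30)]
print(results)  # [10.0, 15.0, 20.0]
```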

Practical examples

Reading large files

def read_large_file(file_path):
    with open(file_path, 'r') as f:
        for line in f:
            yield line.strip()

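# The file is read one line at a time and never loaded whole.
# process() is a placeholder for your own per-line handling.
for line in read_large_file('large_file.txt'):
    process(line)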
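
For a version that can be run as-is, the sketch below writes a small temporary file to stand in for a large one. The file contents and the length calculation are made up for illustration, and the explicit encoding is an assumption about the input:

```python
import os
import tempfile

def read_large_file(file_path):
    # The file stays open only while the generator is being consumed.
    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:
            yield line.strip()

# Create a small temporary file as stand-in data.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    path = tmp.name

try:
    # Each line is processed as soon as it is read.
    lengths = [len(line) for line in read_large_file(path)]
finally:
    os.remove(path)

print(lengths)  # [5, 4, 5]
```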

Infinite sequences

def infinite_counter(start=0):
    n = start
    while True:
        yield n
        n += 1

# The generator can keep producing numbers indefinitely
counter = infinite_counter()
for _ in range(5):
    print(next(counter))  # 0, 1, 2, 3, 4
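
Instead of calling next() in a counting loop, itertools.islice can take a bounded slice of an infinite generator. A brief sketch, which also shows itertools.count as the standard-library counterpart of infinite_counter:

```python
from itertools import count, islice

def infinite_counter(start=0):
    n = start
    while True:
        yield n
        n += 1

# Take only the first five values; the generator is never exhausted.
first_five = list(islice(infinite_counter(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# itertools.count(start, step) provides the same behavior out of the box.
evens = list(islice(count(0, 2), 5))
print(evens)  # [0, 2, 4, 6, 8]
```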

Fibonacci sequence

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
for _ in range(10):
    print(next(fib), end=" ")
# 0 1 1 2 3 5 8 13 21 34
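
When the stopping point depends on the values rather than on a count, itertools.takewhile can bound the infinite sequence. A small sketch; the limit of 100 is an arbitrary choice:

```python
from itertools import takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Consume values until the first one that fails the condition.
below_100 = list(takewhile(lambda x: x < 100, fibonacci()))
print(below_100)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```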

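Batch processing

def batch(iterable, size):
    # Use a distinct name for the buffer so it does not shadow the function.
    current = []
    for item in iterable:
        current.append(item)
        if len(current) == size:
            yield current
            current = []
    # Yield the final, possibly shorter, batch.
    if current:
        yield current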

data = range(10)
for b in batch(data, 3):
    print(b)

Output:

[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]
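
The same batching can be written more compactly with itertools.islice. This is an alternative sketch rather than the approach shown above, and batch_islice is a hypothetical name:

```python
from itertools import islice

def batch_islice(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items from the shared iterator.
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # the source is exhausted
        yield chunk

batches = list(batch_islice(range(10), 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```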

Pipeline processing

def read_data():
    for i in range(10):
        yield i

def filter_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def square(numbers):
    for n in numbers:
        yield n ** 2

# Build the processing pipeline; nothing runs until it is iterated
pipeline = square(filter_even(read_data()))

for value in pipeline:
    print(value)  # 0, 4, 16, 36, 64
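
For simple stages like these, chained generator expressions give the same lazy pipeline without defining separate functions. A sketch of the equivalent, with illustrative variable names:

```python
numbers = range(10)

# Each stage wraps the previous one; no values are computed yet.
evens = (n for n in numbers if n % 2 == 0)
squares = (n ** 2 for n in evens)

# Iterating the last stage pulls values through the whole chain.
result = list(squares)
print(result)  # [0, 4, 16, 36, 64]
```

Generator functions remain the better fit when a stage needs state, error handling, or more than one expression.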

Generator methods

def my_generator():
    try:
        yield 1
        yield 2
        yield 3
    except GeneratorExit:
        print("Generator closed")
    finally:
        print("Cleanup")

gen = my_generator()
print(next(gen))  # 1
gen.close()  # prints "Generator closed", then "Cleanup"

Method                Description
next(gen)             Get the next value
gen.send(value)       Send a value into the generator
gen.throw(exception)  Raise an exception inside the generator
gen.close()           Close the generator
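
Of these methods, throw() is the only one not demonstrated above. The sketch below uses a hypothetical resilient_counter generator that treats a thrown ValueError as a signal to reset; the choice of exception type is arbitrary:

```python
def resilient_counter():
    """Hypothetical generator: counts up, resets when ValueError is thrown in."""
    n = 0
    while True:
        try:
            yield n
            n += 1
        except ValueError:
            # The exception is raised at the paused yield; handle it and go on.
            n = 0

gen = resilient_counter()
a = next(gen)
b = next(gen)
# throw() raises inside the generator and returns the next yielded value.
c = gen.throw(ValueError("reset"))
d = next(gen)

print(a, b, c, d)  # 0 1 0 1
```

If the generator does not catch the thrown exception, it propagates back to the caller of throw().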