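The most common ways to filter a list in Python are list comprehensions and the built-in filter() function. The examples below cover several typical scenarios: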
Basic filtering (list comprehensions)
# Keep only the even numbers
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = [n for n in numbers if n % 2 == 0]
print(even_numbers)  # [2, 4, 6, 8, 10]
# Keep only the numbers greater than 5
big_numbers = [n for n in numbers if n > 5]
print(big_numbers)  # [6, 7, 8, 9, 10]
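Using the filter() function
# Define a predicate function
def is_positive(n):
    return n > 0
numbers = [-3, -1, 0, 2, 5, 8]
# filter() returns an iterator, so wrap it in list() to get a list
positive = list(filter(is_positive, numbers))
print(positive)  # [2, 5, 8]
# The same filter written with a lambda expression
positive = list(filter(lambda x: x > 0, numbers))
print(positive)  # [2, 5, 8]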
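Because filter() returns a lazy iterator in Python 3, it can also be chained onto another iterator without building an intermediate list. The following is a minimal sketch of that behaviour; the `naturals` stream and the cutoff of three items are illustrative choices, not part of the examples above.

```python
from itertools import count, islice

# An unbounded stream of integers: 1, 2, 3, ...
naturals = count(1)

# filter() does no work yet; it only wraps the underlying iterator.
multiples_of_7 = filter(lambda n: n % 7 == 0, naturals)

# Values are computed only as they are requested.
first_three = list(islice(multiples_of_7, 3))
print(first_three)  # [7, 14, 21]
```

A list comprehension over `naturals` would never finish here, which is one case where filter() (or a generator expression) is the better fit.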
Filtering strings
# Filter words by prefix, length, or content
words = ['apple', 'banana', 'orange', 'grape', 'watermelon']
# Keep words that start with 'a'
a_words = [w for w in words if w.startswith('a')]
print(a_words)  # ['apple']
# Keep words longer than 5 characters
long_words = [w for w in words if len(w) > 5]
print(long_words)  # ['banana', 'orange', 'watermelon']
# Keep words that contain 'n'
n_words = [w for w in words if 'n' in w]
print(n_words)  # ['banana', 'orange', 'watermelon']
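Filtering on compound conditions
# Keep elements that satisfy several conditions
numbers = [15, 20, 25, 30, 35, 40, 45, 50]
# Keep numbers divisible by 3 and greater than 25
result = [n for n in numbers if n % 3 == 0 and n > 25]
print(result)  # [30, 45]
# Keep numbers divisible by 5 or less than 20
result = [n for n in numbers if n % 5 == 0 or n < 20]
print(result)  # [15, 20, 25, 30, 35, 40, 45, 50] (every element is divisible by 5, so nothing is filtered out)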
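When the number of conditions grows, one option is to give each condition a name and combine them with all(). The sketch below assumes a hypothetical helper, `filter_all`, introduced here only for illustration; it reproduces the "divisible by 3 and greater than 25" case.

```python
numbers = [15, 20, 25, 30, 35, 40, 45, 50]


def divisible_by_3(n):
    return n % 3 == 0


def greater_than_25(n):
    return n > 25


def filter_all(items, predicates):
    """Hypothetical helper: keep items that satisfy every predicate."""
    return [x for x in items if all(p(x) for p in predicates)]


result = filter_all(numbers, [divisible_by_3, greater_than_25])
print(result)  # [30, 45]
```

Swapping all() for any() would give the "or" version of the same filter.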
Filtering a list of dictionaries
# Filter student score records
students = [
    {'name': '张三', 'score': 85, 'age': 18},
    {'name': '李四', 'score': 92, 'age': 20},
    {'name': '王五', 'score': 78, 'age': 19},
    {'name': '赵六', 'score': 95, 'age': 21}
]
# Keep students whose score is above 80
good_students = [s for s in students if s['score'] > 80]
print(good_students)
# [{'name': '张三', 'score': 85, 'age': 18}, {'name': '李四', 'score': 92, 'age': 20}, {'name': '赵六', 'score': 95, 'age': 21}]
# Keep students older than 19 with a score above 90
excellent = [s for s in students if s['age'] > 19 and s['score'] > 90]
print(excellent)
# [{'name': '李四', 'score': 92, 'age': 20}, {'name': '赵六', 'score': 95, 'age': 21}]
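The examples above assume every record has a 'score' key with a numeric value. Real data often does not, so the sketch below shows one possible way to guard against missing or empty fields with dict.get(). The incomplete records are made up for illustration.

```python
students = [
    {'name': '张三', 'score': 85},
    {'name': '李四'},                 # no 'score' key at all
    {'name': '王五', 'score': None},  # score not recorded yet
    {'name': '赵六', 'score': 95},
]

# s['score'] would raise KeyError for 李四; .get() returns None instead.
# The "is not None" check must come first, because None > 80 raises TypeError.
good_students = [
    s for s in students
    if s.get('score') is not None and s['score'] > 80
]
print([s['name'] for s in good_students])  # ['张三', '赵六']
```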
Handling None values
# Remove None values
data = [1, None, 3, None, 5, None, 7]
cleaned = [x for x in data if x is not None]
print(cleaned)  # [1, 3, 5, 7]
# filter(None, ...) removes every falsy value, not just None
cleaned = list(filter(None, data))
print(cleaned)  # [1, 3, 5, 7] (note: 0 and empty strings would also be dropped)
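The difference between the two approaches only shows up when the data contains other falsy values. A small sketch with a made-up list that mixes None with 0, empty containers, and False:

```python
data = [0, 1, None, '', 'a', [], False, 2]

# filter(None, ...) drops every falsy value: 0, None, '', [] and False.
truthy_only = list(filter(None, data))
print(truthy_only)  # [1, 'a', 2]

# An explicit identity check drops only None; other falsy values survive.
not_none = [x for x in data if x is not None]
print(not_none)  # [0, 1, '', 'a', [], False, 2]
```

If 0 or an empty string is a legitimate value in your data, the explicit `is not None` check is the safer choice.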
Practical example: text processing
# Keep entries that look like mobile numbers (exactly 11 digits)
contacts = ['13812345678', 'abc', '15987654321', '010-12345678', '12345']
valid_phones = [phone for phone in contacts if phone.isdigit() and len(phone) == 11]
print(valid_phones)  # ['13812345678', '15987654321']
# Keep entries that look like email addresses ('@' followed by a domain containing '.')
texts = ['user@email.com', 'hello', 'admin@company.cn', 'test123']
emails = [t for t in texts if '@' in t and '.' in t.split('@')[-1]]
print(emails)  # ['user@email.com', 'admin@company.cn']
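The digit-and-length check above accepts any 11-digit string, and str.isdigit() also accepts non-ASCII digits such as full-width characters. If a stricter rule is wanted, a regular expression is one option. The pattern below is an assumption made for illustration (11 ASCII digits, first digit 1, second digit 3 to 9), not an authoritative definition of a valid number.

```python
import re

# Assumed pattern: 11 ASCII digits, starting with 1, second digit 3-9.
PHONE_RE = re.compile(r'1[3-9][0-9]{9}')

contacts = ['13812345678', 'abc', '15987654321', '010-12345678',
            '12345', '12345678901']

# fullmatch() requires the whole string to match, not just a prefix.
valid_phones = [c for c in contacts if PHONE_RE.fullmatch(c)]
print(valid_phones)  # ['13812345678', '15987654321']
```

Note that '12345678901' passes the simpler length check but is rejected here because its second digit is 2.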
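Performance comparison
import time
numbers = list(range(1000000))
# Method 1: list comprehension (recommended)
# perf_counter() is a monotonic, high-resolution clock intended for timing code
start = time.perf_counter()
result1 = [n for n in numbers if n % 2 == 0]
print(f"List comprehension: {time.perf_counter() - start:.4f} s")
# Method 2: filter() function
start = time.perf_counter()
result2 = list(filter(lambda x: x % 2 == 0, numbers))
print(f"filter(): {time.perf_counter() - start:.4f} s")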
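A single timed run is sensitive to whatever else the machine is doing. For a steadier comparison, the standard-library timeit module can repeat each measurement. The following is a rough sketch: the list size and repeat counts are arbitrary choices, and the actual numbers depend on the machine and Python version.

```python
import timeit

numbers = list(range(100_000))


def with_comprehension():
    return [n for n in numbers if n % 2 == 0]


def with_filter():
    return list(filter(lambda x: x % 2 == 0, numbers))


RUNS = 5  # calls per measurement

# repeat() returns one total time per measurement; taking the minimum
# reduces the influence of unrelated background activity.
for func in (with_comprehension, with_filter):
    best = min(timeit.repeat(func, number=RUNS, repeat=3))
    print(f"{func.__name__}: {best / RUNS * 1000:.3f} ms per call")
```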
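Recommendations:
- List comprehension: the most common choice; concise, efficient, and suitable for most scenarios
- filter() function: a good fit when a predicate function already exists or when working with iterators
- for loop: a good fit when the logic is complex or each element needs several operations
List comprehensions are usually the first choice because they are both concise and efficient.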
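The for-loop case in the list above has no example yet, so here is a minimal sketch of a filter that also converts values and records why items were rejected. The sample `records` and the 0 to 100 score range are assumptions made for illustration.

```python
records = ['85', 'abc', '92', '', '47', '101']

valid_scores = []
rejected = []

for raw in records:
    # Step 1: reject anything that is not a plain non-negative integer.
    if not raw.isdigit():
        rejected.append((raw, 'not a number'))
        continue
    score = int(raw)
    # Step 2: enforce the assumed valid range of 0 to 100.
    if score > 100:
        rejected.append((raw, 'out of range'))
        continue
    valid_scores.append(score)

print(valid_scores)  # [85, 92, 47]
print(rejected)      # [('abc', 'not a number'), ('', 'not a number'), ('101', 'out of range')]
```

Doing the same in a single comprehension would need two passes or a helper function, which is why a plain loop tends to read better here.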