Python3: Understanding and Using Iterable Objects (iterable), Iterators (iterator), and Generators (generator)

Time: 2022-03-21 06:05:15

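Contents

1. Iterable objects (iterable)
    1). Iterability: how the for loop works
    2). Characteristics of iterable objects
    3). Source code of iterable objects
2. Iterators (iterator)
    1). Source code of iterators
    2). Iterable objects & iterators: the differences
    3). A custom iterator: the Fibonacci sequence
    4). When to use iterators?
3. Generators (generator)
    1). Characteristics of generators?
    2). Creating generators?
    3). How yield works
    4). yield & return in generators
    5). send() & next() in generators: the difference
    6). Generator application: a simple producer/consumer model
    7). Generator application: task switching with yield (simulated coroutines)
    8). Generator application: reading large files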

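1. Iterable objects (iterable)

1). Iterability: how the for loop works

1. Strings, lists, tuples, dictionaries, sets, files, and so on are all iterable objects, and all of them can be iterated over.
2. Being usable in a for loop does not by itself make an object an instance of Iterable.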

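1. Defines only the __getitem__ magic method: can be iterated over, but is not an Iterable

    # Python 3.10 removed the old alias collections.Iterable; import from collections.abc
    from collections.abc import Iterable

    # 1. Implement only __getitem__
    class A:
        def __init__(self):
            self.data = [1, 2, 3]

        def __getitem__(self, index):
            return self.data[index]

    a = A()
    print(isinstance(a, Iterable))  # check whether a is an Iterable
    for i in a:
        print(i)

    # Output:
    # False
    # 1
    # 2
    # 3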

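2. Defines both the __getitem__ and __iter__ magic methods: an iterable object

    from collections.abc import Iterable

    class A:
        def __init__(self):
            self.data = [1, 2, 3]
            self.data1 = [4, 5, 6]

        def __iter__(self):
            return iter(self.data1)

        def __getitem__(self, index):
            return self.data[index]

    a = A()
    print(isinstance(a, Iterable))  # check whether a is an Iterable
    for i in a:
        print(i)

    # Output:
    # True
    # 4
    # 5
    # 6

As the code above shows, if only __getitem__ is implemented, the for loop calls __getitem__ automatically with an index that starts at 0 and increases by 1, assigning the value at each index to i until IndexError is raised.
If __iter__ is implemented as well, __getitem__ is ignored: the loop calls only __iter__, walks the iterator that __iter__ returns, and assigns each member to i in turn until StopIteration is raised.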
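The two lookup paths described above can be sketched by writing out by hand roughly what a for loop does. This is only an illustration under assumptions: OnlyGetItem and manual_for are hypothetical names, not part of the original code, and the real loop is implemented inside the interpreter.

```python
class OnlyGetItem:
    """Hypothetical class that supports indexing but defines no __iter__."""

    def __init__(self):
        self.data = [1, 2, 3]

    def __getitem__(self, index):
        # Raises IndexError once index runs past the end of the list.
        return self.data[index]


def manual_for(obj):
    """Collect the items of obj in the order a for loop would visit them."""
    # iter() uses __iter__ when it exists; otherwise it falls back to
    # calling __getitem__ with 0, 1, 2, ... until IndexError.
    iterator = iter(obj)
    items = []
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # A for loop catches this signal silently and simply ends.
            break
        items.append(item)
    return items


print(manual_for(OnlyGetItem()))  # [1, 2, 3]
print(manual_for("ab"))           # ['a', 'b']
```

The point of the sketch is that the loop never sees IndexError directly: the fallback iterator created by iter() turns it into StopIteration.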

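2). Characteristics of iterable objects:

Strings, lists, tuples, dictionaries, sets, files, and so on are all iterable objects.
An object that implements the __iter__ method is called an iterable object. __iter__ returns an iterator object, and calling next() on that iterator retrieves the elements one at a time.
Intuitively, an object that a for loop can iterate over is an iterable object (with the __getitem__-only caveat described above).

[The diagram below is very, very important!]

How to read it:
Take a list object X as an example. iter() returns an iterator for it, and the iterator's next step returns the list's elements, which shows that X is an iterable object.
There is a new concept in the middle, the iterator, which is explained in detail later.

    my_list = ["hello", "alien", "world"]

    # The two calls below are equivalent; both return an iterator
    list_type_iterator = my_list.__iter__()  # jump from this method to the source of the iterable
    # list_type_iterator = iter(my_list)
    print(list_type_iterator)

    # Python 3 spells the method __next__() (Python 2 used .next());
    # next(list_type_iterator) is equivalent
    item = list_type_iterator.__next__()  # jump from this method to the source of the iterator
    print(item)
    item = list_type_iterator.__next__()
    print(item)
    item = list_type_iterator.__next__()
    print(item)

    # Output:
    # <list_iterator object at 0x10a8fa3d0>
    # hello
    # alien
    # world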

3). Source code of iterable objects:

From the code above, jump into the source of __iter__() (Ctrl + B in the IDE) to take a closer look:

    list_type_iterator = my_list.__iter__()

This opens the IDE-generated stub file __builtin__.py, which defines the commonly used built-in data types. (The name __builtin__ and the file class shown below come from the Python 2 stubs; in Python 3 the module is called builtins, and file objects are the io classes returned by open(), which also define __iter__.) Searching this file for the __iter__ method turns up code like the following:

    class list(object):
        """
        list() -> new empty list
        list(iterable) -> new list initialized from iterable's items
        """
        def append(self, p_object): # real signature unknown; restored from __doc__
            """ L.append(object) -- append object to end """
            pass

        def __iter__(self): # real signature unknown; restored from __doc__
            """ x.__iter__() <==> iter(x) """
            pass

    class dict(object):
        """
        dict() -> new empty dictionary
        dict(mapping) -> new dictionary initialized from a mapping object's
            (key, value) pairs
        dict(iterable) -> new dictionary initialized as if via:
            d = {}
            for k, v in iterable:
                d[k] = v
        dict(**kwargs) -> new dictionary initialized with the name=value pairs
            in the keyword argument list. For example: dict(one=1, two=2)
        """
        def clear(self): # real signature unknown; restored from __doc__
            """ D.clear() -> None. Remove all items from D. """
            pass

        def __iter__(self): # real signature unknown; restored from __doc__
            """ x.__iter__() <==> iter(x) """
            pass

    class file(object):
        """
        file(name[, mode[, buffering]]) -> file object

        Open a file. The mode can be 'r', 'w' or 'a' for reading (default),
        writing or appending. The file will be created if it doesn't exist
        when opened for writing or appending; it will be truncated when
        opened for writing. Add a 'b' to the mode for binary files.
        Add a '+' to the mode to allow simultaneous reading and writing.
        If the buffering argument is given, 0 means unbuffered, 1 means line
        buffered, and larger numbers specify the buffer size. The preferred way
        to open a file is with the builtin open() function.
        Add a 'U' to mode to open the file for input with universal newline
        support. Any line ending in the input file will be seen as a '\n'
        in Python. Also, a file so opened gains the attribute 'newlines';
        the value for this attribute is one of None (no newline read yet),
        '\r', '\n', '\r\n' or a tuple containing all the newline types seen.

        'U' cannot be combined with 'w' or '+' mode.
        """
        def readline(self, size=None): # real signature unknown; restored from __doc__
            """
            readline([size]) -> next line from the file, as a string.

            Retain newline. A non-negative size argument limits the maximum
            number of bytes to return (an incomplete line may be returned then).
            Return an empty string at EOF.
            """
            pass

        def close(self): # real signature unknown; restored from __doc__
            """
            close() -> None or (perhaps) an integer. Close the file.

            Sets data attribute .closed to True. A closed file cannot be used for
            further I/O operations. close() may be called more than once without
            error. Some kinds of file objects (for example, opened by popen())
            may return an exit status upon closing.
            """
            pass

        def __iter__(self): # real signature unknown; restored from __doc__
            """ x.__iter__() <==> iter(x) """
            pass

We find that the commonly used iterable objects all define the __iter__ method, so when we write a custom iterator later, it needs this method too.
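The observation that the common built-in types all define __iter__ can be checked without opening any stub file. The snippet below is a small sketch: the sample values are made up for illustration, and io.StringIO stands in for a real file so that nothing has to exist on disk.

```python
import io

# Hypothetical sample objects, one per type discussed in the text.
samples = {
    "str": "abc",
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "dict": {"a": 1},
    "set": {1, 2, 3},
    "file-like": io.StringIO("line 1\nline 2\n"),
    "int": 42,  # included as a counterexample: integers are not iterable
}

# Record whether each object exposes an __iter__ method.
has_iter = {name: hasattr(obj, "__iter__") for name, obj in samples.items()}

for name, flag in has_iter.items():
    print(f"{name:10} defines __iter__: {flag}")
```

Every entry except the integer should report True, which matches what the stubs show for list, dict, and file.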

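2. Iterators (iterator)

1). Source code of iterators:

Following the next call from the earlier example (spelled __next__() in Python 3) leads into the iterator's source:

    item = list_type_iterator.__next__()

item is one element of the iterable object.

    @runtime_checkable
    class Iterable(Protocol[_T_co]):
        @abstractmethod
        def __iter__(self) -> Iterator[_T_co]: ...
        # Note: an iterable object returns an iterator from __iter__()

    @runtime_checkable
    class Iterator(Iterable[_T_co], Protocol[_T_co]):
        @abstractmethod
        def __next__(self) -> _T_co: ...
        # Note: an iterator returns one element from __next__()
        def __iter__(self) -> Iterator[_T_co]: ...

From the above, an iterator has one more distinctive method, __next__() (invoked through the built-in next()), whose job is to return a single element.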

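2). Iterable objects & iterators: the differences

From the code and the source shown above, we can conclude:
1. Every iterable object has an __iter__ function.
2. An iterable object returns an iterator from __iter__, and the iterator's next step returns its elements.
3. After each call to next(), the iterator records how far it has progressed, and the following call to next() continues from that position, so successive calls return consecutive elements.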
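A short experiment makes the difference concrete. It is only a sketch using a plain list; the variable names are chosen for illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]   # an iterable object, but not an iterator
it = iter(numbers)    # an iterator over that list

list_is_iterable = isinstance(numbers, Iterable)   # True
list_is_iterator = isinstance(numbers, Iterator)   # False: a list has no __next__
iter_is_iterator = isinstance(it, Iterator)        # True
returns_itself = iter(it) is it                    # True: an iterator's __iter__ returns self

first = next(it)        # 1; the iterator remembers that it has moved on
rest = list(it)         # [2, 3]; continues from the recorded position
again = list(it)        # []; the iterator is exhausted and does not restart
fresh = list(numbers)   # [1, 2, 3]; the list hands out a new iterator every time

print(first, rest, again, fresh)
```

In short, the iterable is the container that can be traversed repeatedly, while the iterator is a one-shot cursor over it.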

3). A custom iterator: the Fibonacci sequence

    class Fib():
        def __init__(self, max):
            self.n = 0
            self.prev = 0
            self.curr = 1
            self.max = max

        def __iter__(self):
            return self

        def __next__(self):
            if self.n < self.max:
                value = self.curr
                self.curr += self.prev
                self.prev = value
                self.n += 1
                return value
            else:
                raise StopIteration

    fb = Fib(5)
    print(fb.__next__())
    print(fb.__next__())
    print(fb.__next__())
    print(fb.__next__())
    print(fb.__next__())
    print(fb.__next__())  # sixth call: the iterator is exhausted, so this raises

    # Output:
    # 1
    # 1
    # 2
    # 3
    # 5
    # Traceback (most recent call last):
    #   File "/Volumes/Develop/iterator_generator.py", line 43, in <module>
    #     print(fb.__next__())
    #   File "/Volumes/Develop/iterator_generator.py", line 34, in __next__
    #     raise StopIteration
    # StopIteration

Note:
1. Once the iterator has no more data, calling next() again raises StopIteration.
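In everyday code the StopIteration error rarely has to be handled by hand. The sketch below shows two common ways around it, assuming a hypothetical CountDown iterator written only for this illustration: a for loop (or comprehension) stops on its own, and next() accepts a default value to return instead of raising.

```python
class CountDown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# 1. A loop or comprehension catches StopIteration internally.
collected = [n for n in CountDown(3)]
print(collected)  # [3, 2, 1]

# 2. next(iterator, default) returns the default once the data runs out.
cd = CountDown(1)
first = next(cd, None)   # 1
second = next(cd, None)  # None instead of an exception
print(first, second)
```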

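4). When to use iterators?

1. Why use an iterator?

An iterator uses very little memory and avoids unnecessary computation, because it is evaluated lazily: each call to next() computes exactly one value, and otherwise nothing is computed or stored. An iterator also records its progress, so the following call to next() resumes from where the previous one stopped.

For example, suppose you want a Fibonacci sequence with 100000000 entries. Generating all of them before using them would certainly take a large amount of memory. With an iterator, only the current state is kept in memory, and each value is computed only at the moment it is requested.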
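The memory argument can be illustrated roughly as follows. LazyFib is a hypothetical endless variant of a Fibonacci iterator written for this sketch, and sys.getsizeof reports only the size of the container object itself, so the comparison is indicative rather than a precise measurement.

```python
import sys
from itertools import islice


class LazyFib:
    """Hypothetical endless Fibonacci iterator; its state is just two integers."""

    def __init__(self):
        self.prev, self.curr = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        value = self.curr
        self.prev, self.curr = self.curr, self.prev + self.curr
        return value


# Only the values actually requested are computed.
first_ten = list(islice(LazyFib(), 10))
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

# An eager list stores every element; an iterator stores only its position.
eager = list(range(100_000))
lazy = iter(range(100_000))
lazy_is_smaller = sys.getsizeof(lazy) < sys.getsizeof(eager)
print(sys.getsizeof(eager), sys.getsizeof(lazy), lazy_is_smaller)
```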

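2. Reading files relies on the iterator mechanism

[Ordinary reading]

    # readlines() reads the whole file and builds a list in which each line is one element
    with open("test.txt") as f:
        for line in f.readlines():
            print(line)
    # 1. The whole file is loaded into memory at once and then printed line by line.
    # 2. For a large file the memory cost is high; if the file is larger than memory, the program can crash.

[Reading with the iterator]

    with open("test.txt") as f:
        for line in f:  # use the file iterator
            print(line)
    # This is the simplest form and is typically fast: it never reads the file explicitly,
    # but lets the iterator fetch the next line on each pass.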
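A self-contained variant of the iterator-style read is sketched below. It assumes nothing about an existing test.txt: a small temporary file with made-up contents is created first, and the with statement closes the file once the loop is done.

```python
import os
import tempfile

# Create a small throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

line_count = 0
longest = ""
with open(path, encoding="utf-8") as f:
    for line in f:                 # the file object is its own iterator
        line = line.rstrip("\n")   # each item keeps its trailing newline
        line_count += 1
        if len(line) > len(longest):
            longest = line

os.remove(path)  # clean up the temporary file
print(line_count, longest)  # 3 second
```

Only one line is held in memory at a time, so the same loop works for files far larger than this one.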

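3. Generators (generator)

1). Characteristics of generators?

A generator is a special kind of iterator: it has all the properties of an iterator, but it is more elegant to write. There is no need to write the __iter__() and __next__() methods as in the class above; the yield keyword is all it takes. Every generator is an iterator (the converse does not hold), so every generator also produces its values lazily.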
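The claim that every generator is an iterator, but not the other way round, can be checked with the abstract base classes in collections.abc. The function count_up below is a hypothetical example written for this check.

```python
from collections.abc import Generator, Iterator


def count_up(limit):
    """Hypothetical generator function that yields 1, 2, ..., limit."""
    n = 1
    while n <= limit:
        yield n
        n += 1


gen = count_up(3)

gen_is_iterator = isinstance(gen, Iterator)     # True
gen_is_generator = isinstance(gen, Generator)   # True
# Both protocol methods exist even though we never wrote them.
has_protocol = hasattr(gen, "__iter__") and hasattr(gen, "__next__")

# A plain list iterator is an iterator but not a generator.
list_iter_is_generator = isinstance(iter([1, 2, 3]), Generator)  # False

values = list(gen)
print(gen_is_iterator, gen_is_generator, list_iter_is_generator, values)
```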

2). Creating generators?

1. Change the [ ] of a list comprehension to ( )

    L = [x * 2 for x in range(5)]
    print(type(L))
    G = (x * 2 for x in range(5))
    print(G)
    # Output:
    # <class 'list'>
    # <generator object <genexpr> at 0x10fa48730>

    # =================================
    # Get elements from the generator
    G = (x * 2 for x in range(5))
    print(next(G))
    print(next(G))
    print(next(G))
    # Output:
    # 0
    # 2
    # 4

2. Create a generator with a function

    def fib(max):
        n, a, b = 0, 0, 1
        while n < max:
            yield b
            a, b = b, a + b
            n = n + 1

    a = fib(10)
    print(next(a))
    print(next(a))
    print(next(a))
    # Output:
    # 1
    # 1
    # 2

3). How yield works

    def fib(max):
        n, a, b = 0, 0, 1
        while n < max:
            print("yield--------start")
            yield b  # each next() call runs up to this point and returns one element
            a, b = b, a + b  # the code below yield runs on the following next() call
            n = n + 1
            print("yield--------end")

    fb = fib(5)
    print(next(fb))
    print("\n")
    print(next(fb))
    print("\n")
    print(next(fb))

    # Output:
    # yield--------start
    # 1
    #
    #
    # yield--------end
    # yield--------start
    # 1
    #
    #
    # yield--------end
    # yield--------start
    # 2
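The pause-and-resume behaviour of yield can also be observed from the outside with inspect.getgeneratorstate from the standard library. The sketch below uses a hypothetical two-step generator and records the state before, between, and after the next() calls.

```python
import inspect


def two_steps():
    """Hypothetical generator with exactly two yield points."""
    yield "first"
    yield "second"


gen = two_steps()
states = [inspect.getgeneratorstate(gen)]      # nothing has run yet

next(gen)                                      # runs up to the first yield
states.append(inspect.getgeneratorstate(gen))

next(gen)                                      # resumes, stops at the second yield
states.append(inspect.getgeneratorstate(gen))

leftover = next(gen, None)                     # body finishes; default is returned
states.append(inspect.getgeneratorstate(gen))

print(states)
# ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```

Calling the generator function runs none of its body; the first next() starts it, and each yield leaves it suspended until the following call.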

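4). yield & return in generators

Like an iterator, a generator raises an error if next() is called again after all of its elements have been produced. To handle this more gracefully, use return.
return can hand back a specific [message] when iteration finishes; catching the StopIteration error with try then gives access to that [message] through the exception's value attribute.

    def fib(max):
        n, a, b = 0, 0, 1
        while n < max:
            yield b
            a, b = b, a + b
            n = n + 1
        return 'iter num finish'

1. Approach 1:

    fb = fib(5)  # create the generator that the calls below consume

    def iter_list(iterator):
        try:
            x = next(iterator)
            print("----->", x)
        except StopIteration as ret:
            stop_reason = ret.value
            print(stop_reason)

    iter_list(fb)
    iter_list(fb)
    iter_list(fb)
    iter_list(fb)
    iter_list(fb)
    iter_list(fb)

    # Output:
    # -----> 1
    # -----> 1
    # -----> 2
    # -----> 3
    # -----> 5
    # iter num finish

2. Approach 2:

    fb = fib(5)

    def iter_list(iterator):
        while True:
            try:
                x = next(iterator)
                print("----->", x)
            except StopIteration as ret:
                stop_reason = ret.value
                print(stop_reason)
                break

    iter_list(fb)

    # Output:
    # -----> 1
    # -----> 1
    # -----> 2
    # -----> 3
    # -----> 5
    # iter num finish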
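One consequence worth spelling out: a for loop (or a comprehension) catches StopIteration by itself, so the value given to return is not visible there. The sketch below contrasts the two cases; drain is a hypothetical helper that follows the same try/except pattern as above and hands back both the yielded values and the return value.

```python
def fib(max_count):
    n, a, b = 0, 0, 1
    while n < max_count:
        yield b
        a, b = b, a + b
        n += 1
    return "iter num finish"


def drain(gen):
    """Hypothetical helper: return (list of yielded values, return value)."""
    values = []
    while True:
        try:
            values.append(next(gen))
        except StopIteration as stop:
            # The generator's return value travels in stop.value.
            return values, stop.value


via_for = [x for x in fib(5)]      # the return value is silently dropped here
values, reason = drain(fib(5))     # the return value is captured here

print(via_for)          # [1, 1, 2, 3, 5]
print(values, reason)   # [1, 1, 2, 3, 5] iter num finish
```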

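5). send() & next() in generators: the difference

1. next() wakes the generator up and lets it continue running.
2. send() wakes the generator up and lets it continue running, and at the same time sends a value into the generator, where a variable is needed to receive it.

    def fib(max):
        n, a, b = 0, 0, 1
        while n < max:
            temp = yield b
            print("\n temp------>", temp)
            a, b = b, a + b
            n = n + 1

    a = fib(10)
    print(next(a))
    abc = a.send("hello")
    print(abc)
    abc = a.send("alien")
    print(abc)

    # Output:
    # 1
    #
    #  temp------> hello
    # 1
    #
    #  temp------> alien
    # 2

The use of send() above shows the following:
1. One send() also performs one next() and, in addition, passes a value that the temp variable receives, so it does two things at once.
2. The result of a.send("hello") is the same as what next(a) would have returned.
3. When send() runs, it first assigns the passed value to temp and then does the work of next().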
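One detail the example above relies on is that the generator is started with next() before the first send(). A generator that has not yet reached a yield has nowhere to receive a value, so sending anything other than None at that point raises TypeError. The sketch below shows this with a hypothetical running_average generator.

```python
def running_average():
    """Hypothetical generator: receives numbers via send(), yields their mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pauses here; send(x) makes this expression evaluate to x
        total += value
        count += 1
        average = total / count


avg = running_average()

# Sending a real value before the generator has started is an error.
send_before_start_failed = False
try:
    avg.send(10)
except TypeError:
    send_before_start_failed = True

primed = next(avg)   # run to the first yield; avg.send(None) would do the same
results = [avg.send(10), avg.send(20), avg.send(60)]

print(send_before_start_failed, primed, results)
# True None [10.0, 15.0, 30.0]
```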

6). Generator application: a simple producer/consumer model

    def producter(num):
        print("produce %s product" % num)
        while num > 0:
            consume_num = yield num
            if consume_num:
                print("consume %s product" % consume_num)
                num -= consume_num
            else:
                print("consume 1 time")
                num -= 1
        else:
            return "consume finish"

    p = producter(20)
    print("start----->", next(p), "\n")
    abc = p.send(2)
    print("the rest num---->", abc, "\n")
    print("the rest num---->", next(p), "\n")

    # Output:
    # produce 20 product
    # start-----> 20
    #
    # consume 2 product
    # the rest num----> 18
    #
    # consume 1 time
    # the rest num----> 17

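7). Generator application: task switching with yield (simulated coroutines)

Main characteristics of coroutines:

1. Coroutines are non-preemptive: switching still happens between coroutines, but it is controlled by us, the users. Coroutines mainly address I/O operations.
2. Coroutines execute very efficiently: switching between subroutines is not a thread switch but is controlled by the program itself, so there is no thread-switching overhead. Compared with multithreading, the more threads there are, the more pronounced the performance advantage of coroutines.
3. Coroutines need neither thread locks nor special care for shared data: there is only one thread, so conflicting simultaneous writes to a variable cannot occur. Shared resources are controlled without locks, only by checking state, so execution is much more efficient than with multiple threads.

    def task1(times):
        for i in range(times):
            print('task_1 done the :{} time'.format(i + 1))
            yield

    def task2(times):
        for i in range(times):
            print('task_2 done the :{} time'.format(i + 1))
            yield

    gene1 = task1(5)
    gene2 = task2(5)
    # Each generator yields exactly 5 times, so step each one 5 times;
    # a sixth next() would raise StopIteration.
    for i in range(5):
        next(gene1)
        next(gene2)

    # Output:
    # task_1 done the :1 time
    # task_2 done the :1 time
    # task_1 done the :2 time
    # task_2 done the :2 time
    # task_1 done the :3 time
    # task_2 done the :3 time
    # task_1 done the :4 time
    # task_2 done the :4 time
    # task_1 done the :5 time
    # task_2 done the :5 time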
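The hand-written loop above works only while both tasks take the same number of steps. A slightly more general sketch is a round-robin scheduler that keeps the generators in a queue and drops each one when it finishes. The names task and run_round_robin are assumptions made for this illustration, not part of the original code, and this is a toy scheduler rather than a replacement for a real coroutine library.

```python
from collections import deque


def task(name, times, log):
    """Hypothetical task: records one step, then yields control back."""
    for i in range(times):
        log.append(f"{name} step {i + 1}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Step each generator in turn until all of them are exhausted."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # finished: do not put it back
        queue.append(current)      # not finished: back of the line


log = []
run_round_robin([task("task_1", 2, log), task("task_2", 3, log)])
for entry in log:
    print(entry)
# task_1 step 1
# task_2 step 1
# task_1 step 2
# task_2 step 2
# task_2 step 3
```

Because each task decides where it yields, the switching is cooperative and happens in a single thread, which is the non-preemptive behaviour described in the list above.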