Python Fundamentals Tutorial: Working with Files

10. Working with Files

10.1. File I/O

Getting data into and out of files in Python feels a lot like using the low-level methods of C, but it has all the ease of Python layered on top.

For example, to open a file for writing, use the built-in (and very C-like) open() function; the contents of a list can then be written with a single call. The open() function returns an open file object, and the file is closed by calling that object's close() method.

Note

The writelines() and readlines() methods in Python do not handle EOL characters on your behalf, which makes the naming a bit confusing: writelines() does not append line endings, and readlines() does not strip them. In this example, note the \n character at the end of each element of the list.

>>> colors = ['red\n', 'yellow\n', 'blue\n']
>>> f = open('colors.txt', 'w')
>>> f.writelines(colors)
>>> f.close()
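
Because writelines() adds no line endings, a list of plain strings needs them appended before it is written. The following is a minimal sketch of one way to do that, not the only one. The file name colors_plain.txt and the temporary directory are assumptions made for illustration, and the code is written to run on both Python 2.6+ and Python 3.

```python
import os
import tempfile

# Hypothetical data: these strings have no trailing newline.
colors = ['red', 'yellow', 'blue']

# Write into a temporary directory so the working directory is untouched.
path = os.path.join(tempfile.mkdtemp(), 'colors_plain.txt')

f = open(path, 'w')
# writelines() accepts any iterable of strings, so a generator
# expression can append the newline to each element on the fly.
f.writelines(color + '\n' for color in colors)
f.close()

# Read the whole file back to confirm what was written.
f = open(path)
contents = f.read()
f.close()
```

After this runs, contents holds one color per line, each terminated by the \n that was added explicitly.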

By default, the open() function returns a file open for reading. Individual lines can be read with the readline() method, which returns an empty string once the end of the file is reached. Since the zero-length string is not truthy, it makes a simple end-of-file marker.

>>> f = open('colors.txt')
>>> f.readline()
'red\n'
>>> f.readline()
'yellow\n'
>>> f.readline()
'blue\n'
>>> f.readline()
''
>>> f.close()
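
The empty-string marker can drive a loop directly. The sketch below assumes a small sample file created in a temporary directory (the path and the lines_seen list are illustrative names). It deliberately includes a blank line to show that a blank line is read as '\n', which is truthy, so only the real end of file stops the loop.

```python
import os
import tempfile

# Create a sample file that contains a blank line in the middle.
path = os.path.join(tempfile.mkdtemp(), 'colors.txt')
f = open(path, 'w')
f.writelines(['red\n', '\n', 'blue\n'])
f.close()

lines_seen = []
f = open(path)
while True:
    line = f.readline()
    if not line:
        # '' is returned only at end of file; a blank line is '\n'.
        break
    lines_seen.append(line)
f.close()
```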

Alternatively, all of the lines of the file can be read into a list with one method call and then iterated over from there.

>>> f = open('colors.txt')
>>> f.readlines()
['red\n', 'yellow\n', 'blue\n']
>>> f.close()
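
Since readlines() keeps the line endings, a common follow-up step is to strip them while iterating over the list. This is a small sketch under the same assumptions as before: the sample file is created in a temporary directory, and the names raw and names are chosen only for illustration.

```python
import os
import tempfile

# Create the sample file used by this example.
path = os.path.join(tempfile.mkdtemp(), 'colors.txt')
f = open(path, 'w')
f.writelines(['red\n', 'yellow\n', 'blue\n'])
f.close()

f = open(path)
raw = f.readlines()   # the whole file, one list element per line
f.close()

# Remove the trailing newline from each element of the list.
names = [line.rstrip('\n') for line in raw]
```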

However, for large files, reading the entire contents into memory can be impractical. In that case it is best to use the file object itself as an iterator, which reads lines from the file as they are needed instead of holding the whole file in memory.

>>> f = open('colors.txt')
>>> for line in f:
...     print line,
...
red
yellow
blue
>>> f.close()
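
Iterating over the file object pairs naturally with keeping only a running result, so memory use stays small however long the file is. The sketch below illustrates the idea on a tiny sample file; the variables count and longest are hypothetical names, and the file is created in a temporary directory for the sake of a self-contained example.

```python
import os
import tempfile

# Create the sample file used by this example.
path = os.path.join(tempfile.mkdtemp(), 'colors.txt')
f = open(path, 'w')
f.writelines(['red\n', 'yellow\n', 'blue\n'])
f.close()

count = 0
longest = ''
f = open(path)
for line in f:
    # Only the current line and the running results are kept around.
    name = line.rstrip('\n')
    count += 1
    if len(name) > len(longest):
        longest = name
f.close()
```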

To ensure that the files in the above examples are properly closed, they should have been safeguarded against an abnormal exit (by an exception or other unexpected return) using a try/finally statement. In the following example, the final seek() shows that the file has been closed after normal execution.

>>> f = open('colors.txt')
>>> try:
...     lines = f.readlines()
... finally:
...     f.close()
...
>>> lines
['red\n', 'yellow\n', 'blue\n']

>>> f.seek(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file

In the next example, the code path fails attempting to write to a file opened for reading. The file is still closed in the finally clause.

>>> f = open('colors.txt')
>>> try:
...     f.write('magenta\n')
... finally:
...     f.close()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IOError: File not open for writing

>>> f.seek(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file
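
The same guarantee holds when the protected block exits through a return rather than an exception. The sketch below wraps the pattern in a hypothetical helper, first_line(); the opened list exists only so that the file object can be inspected after the function returns and is not something real code would normally need.

```python
import os
import tempfile

# Create the sample file used by this example.
path = os.path.join(tempfile.mkdtemp(), 'colors.txt')
f = open(path, 'w')
f.writelines(['red\n', 'yellow\n', 'blue\n'])
f.close()

opened = []  # illustration only: remembers the file object for inspection


def first_line(path):
    """Return the first line of the file, closing it on the way out."""
    f = open(path)
    opened.append(f)
    try:
        return f.readline()  # early return: the finally clause still runs
    finally:
        f.close()


result = first_line(path)
was_closed = opened[0].closed  # the file object's closed attribute
```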

As of Python 2.6, there is built-in syntax for handling this pattern: the with statement. with creates a context and, regardless of how that context is exited, calls the __exit__() method of the object being managed. In the following example, that object is the file f, but this model works for any object that implements the context manager protocol (the __enter__() and __exit__() methods), which includes many file-like objects (objects with the basic methods of files).

Once again, performing an operation on the file outside of that context shows that it has been closed.

>>> with open('colors.txt') as f:
...     for line in f:
...         print line,
...
red
yellow
blue

>>> f.seek(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file
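
To make the mechanism visible, the sketch below defines a hypothetical class, Tracker, that does nothing except record when its __enter__() and __exit__() methods are called. It is not part of the standard library and only illustrates the protocol that with relies on, including the case where the block exits by an exception.

```python
class Tracker(object):
    """Hypothetical context manager that records the calls made to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self  # this value is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')
        return False  # a false value lets the exception propagate


tracker = Tracker()
try:
    with tracker as t:
        t.events.append('body')
        raise ValueError('abnormal exit')
except ValueError:
    tracker.events.append('caught')
```

The recorded order shows that __exit__() runs before the exception reaches the surrounding except clause.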

As mentioned above, file-like objects are objects that look and feel like files. In Python, there is no strict inheritance requirement in order to use an object as if it were another. StringIO is an example of a file-like object that manages its contents in memory instead of on disk.

It is important to be conscious of the file-like behavior here. Note that a seek() is required after writing to get back to the beginning before reading. Iterating over the same buffer a second time without another seek() yields no values, because the position is already at the end.

StringIO.getvalue() returns a newly created string object with the full contents of the StringIO buffer.

>>> colors = ['red\n', 'yellow\n', 'blue\n']
>>> from StringIO import StringIO
>>> buffer = StringIO()
>>> buffer.writelines(colors)
>>> buffer.seek(0)

>>> for line in buffer:
...     print line,
...
red
yellow
blue

>>> for line in buffer:
...     print line,
...

>>> buffer.getvalue()
'red\nyellow\nblue\n'
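
One practical consequence is that a function written against the file interface can be handed either kind of object. The sketch below uses a hypothetical helper, count_lines(), with both a StringIO buffer and a file on disk. The import fallback is an assumption added so that the example also runs on Python 3, where StringIO lives in the io module.

```python
import os
import tempfile

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3 (assumed fallback)


def count_lines(source):
    """Hypothetical helper: count lines in any object that iterates by line."""
    total = 0
    for line in source:
        total += 1
    return total


colors = ['red\n', 'yellow\n', 'blue\n']

# In-memory, file-like object.
buf = StringIO()
buf.writelines(colors)
buf.seek(0)                      # rewind before reading
from_memory = count_lines(buf)
again = count_lines(buf)         # position is at the end, so nothing is read
buf.seek(0)
after_seek = count_lines(buf)

# Real file on disk, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'colors.txt')
f = open(path, 'w')
f.writelines(colors)
f.close()

f = open(path)
try:
    from_disk = count_lines(f)
finally:
    f.close()
```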