2015-08-27 Tail of a text file encoded in utf-8ΒΆ

Funny thing I went through when I wanted to get the tail of a text file:

with open(filename, "r", encoding="utf8") as f:
    rows = f.readlines()
tail = rows[-10:] if len(rows) > 10 else rows

When the file is too big, it is really tempting to do:

with open(filename, "r", encoding="utf8") as f:
    f.seek(size - threshold)  # added line
    rows = f.readlines()
tail = rows[-10:] if len(rows) > 10 else rows

However, because of the encoding, the cursor might fall in the middle of an utf8 character (they can have 1 or 2 character). In that case, the next instruction readlines fails because of an encoding error. The cursor needs to be moved by one character. See function file_tail.