How Linux tail Is Actually Implemented


How Linux tail and tail -f work

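In an interview I was asked how the Linux tail command is implemented. I was completely stumped, improvised an answer from imagination, and the interviewer ended the interview after hearing it. Afterwards I read the source code and got a rough idea of how it works.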

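Put simply, tail prints the last n lines of a file, defaulting to the last 10 lines when n is not given. The core function in the tail source is shown below. Link: tail.c « src - coreutils.git - GNU coreutils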

/* Print the last N_LINES lines from the end of file FD.
   Go backward through the file, reading 'BUFSIZ' bytes at a time (except
   probably the first), until we hit the start of the file or have
   read NUMBER newlines.
   START_POS is the starting position of the read pointer for the file
   associated with FD (may be nonzero).
   END_POS is the file offset of EOF (one larger than offset of last byte).
   Return true if successful.  */

static bool
file_lines (char const *pretty_filename, int fd, uintmax_t n_lines,
            off_t start_pos, off_t end_pos, uintmax_t *read_pos)
{
  char buffer[BUFSIZ];
  size_t bytes_read;
  off_t pos = end_pos;

  if (n_lines == 0)
    return true;

  /* Set 'bytes_read' to the size of the last, probably partial, buffer;
     0 < 'bytes_read' <= 'BUFSIZ'.  */
  bytes_read = (pos - start_pos) % BUFSIZ;
  if (bytes_read == 0)
    bytes_read = BUFSIZ;
  /* Make 'pos' a multiple of 'BUFSIZ' (0 if the file is short), so that all
     reads will be on block boundaries, which might increase efficiency.  */
  pos -= bytes_read;
  xlseek (fd, pos, SEEK_SET, pretty_filename);
  bytes_read = safe_read (fd, buffer, bytes_read);
  if (bytes_read == SAFE_READ_ERROR)
    {
      error (0, errno, _("error reading %s"), quoteaf (pretty_filename));
      return false;
    }
  *read_pos = pos + bytes_read;

  /* Count the incomplete line on files that don't end with a newline.  */
  if (bytes_read && buffer[bytes_read - 1] != line_end)
    --n_lines;

  do
    {
      /* Scan backward, counting the newlines in this bufferfull.  */

      size_t n = bytes_read;
      while (n)
        {
          char const *nl;
          nl = memrchr (buffer, line_end, n);
          if (nl == NULL)
            break;
          n = nl - buffer;
          if (n_lines-- == 0)
            {
              /* If this newline isn't the last character in the buffer,
                 output the part that is after it.  */
              xwrite_stdout (nl + 1, bytes_read - (n + 1));
              *read_pos += dump_remainder (false, pretty_filename, fd,
                                           end_pos - (pos + bytes_read));
              return true;
            }
        }

      /* Not enough newlines in that bufferfull.  */
      if (pos == start_pos)
        {
          /* Not enough lines in the file; print everything from
             start_pos to the end.  */
          xlseek (fd, start_pos, SEEK_SET, pretty_filename);
          *read_pos = start_pos + dump_remainder (false, pretty_filename, fd,
                                                  end_pos);
          return true;
        }
      pos -= BUFSIZ;
      xlseek (fd, pos, SEEK_SET, pretty_filename);

      bytes_read = safe_read (fd, buffer, BUFSIZ);
      if (bytes_read == SAFE_READ_ERROR)
        {
          error (0, errno, _("error reading %s"), quoteaf (pretty_filename));
          return false;
        }

      *read_pos = pos + bytes_read;
    }
  while (bytes_read > 0);

  return true;
}

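As the source shows, tail does not read from the very end of the file. It first uses xlseek to move to the start of the last (possibly partial) BUFSIZ-sized block and reads that block into buffer with safe_read. It then scans the buffer backward with memrchr, counting occurrences of line_end (the newline character). If the buffer does not contain enough newlines, it steps back by another BUFSIZ bytes (pos -= BUFSIZ followed by xlseek), reads the previous block, and repeats. Once n_lines newlines have been counted, it writes everything after that newline to standard output (xwrite_stdout plus dump_remainder) and returns. If it reaches start_pos first, the file has fewer than n_lines lines, so everything from start_pos to the end is printed.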

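Notice that the file is never loaded into memory in full: at most BUFSIZ bytes are held at a time, and the function relies heavily on xlseek (a wrapper around lseek) to move around the file. I won't go into how lseek works here; in short, it repositions the read offset of an open file descriptor to the requested offset.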

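The function takes start_pos and end_pos as parameters, which mark the start and end positions of the range to scan. These are also obtained with lseek, in the calling function tail_lines.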
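As a rough illustration of how such a pair of offsets can be obtained with lseek alone, here is a minimal sketch. `seekable_range` and `range_for_path` are hypothetical helpers written for this article, not functions from coreutils; the sketch assumes the current offset serves as start_pos and the offset of EOF as end_pos.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store the descriptor's current offset in *START_POS
   and the offset of EOF in *END_POS.  Returns 0 on success, or -1 if FD
   cannot seek (a pipe, for example).  Leaves the offset at EOF.  */
static int
seekable_range (int fd, off_t *start_pos, off_t *end_pos)
{
  assert (start_pos && end_pos);

  off_t cur = lseek (fd, 0, SEEK_CUR);  /* where reading would resume */
  if (cur < 0)
    return -1;
  off_t end = lseek (fd, 0, SEEK_END);  /* one past the last byte */
  if (end < 0)
    return -1;

  *start_pos = cur;
  *end_pos = end;
  return 0;
}

/* Open PATH, skip SKIP bytes to mimic a descriptor that has already been
   partly consumed, then report the range that remains to be scanned.  */
static int
range_for_path (char const *path, off_t skip, off_t *start_pos, off_t *end_pos)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  int rc = -1;
  if (lseek (fd, skip, SEEK_SET) >= 0)
    rc = seekable_range (fd, start_pos, end_pos);
  close (fd);
  return rc;
}
```

The failure case matters in practice: lseek fails on a pipe, so input that cannot seek has to be handled some other way than the backward scan.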

tail has one more feature: with -f (or the older tailf command), it keeps printing content as it is appended to the file. So how is that implemented?

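The principle is simple: tail detects new content through either inotify or polling. inotify is a file event notification mechanism introduced in Linux 2.6.13 that lets a program subscribe directly to events on a file; when tail is notified that the file has changed, it prints the new content. If inotify is not available, tail polls the file for updates instead, and when nothing has changed it sleeps for 1 second before checking again. The code behind tail -f is listed below; it touches many Linux file system details, so I won't analyze it in depth here.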

inotify implementation: tail_forever_inotify
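To make the inotify idea concrete, here is a heavily reduced, Linux-only sketch. It is not tail_forever_inotify: `follow_start`, `follow_once` and `copy_appended` are hypothetical names, only IN_MODIFY is watched, and cases such as file rotation, deletion, or multiple files are ignored.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/inotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy whatever lies between FILE_FD's current offset and EOF to OUT_FD.
   Returns the number of bytes copied, or -1 on error.  */
static ssize_t
copy_appended (int file_fd, int out_fd)
{
  char buf[4096];
  ssize_t total = 0, r;

  while ((r = read (file_fd, buf, sizeof buf)) > 0)
    {
      for (ssize_t off = 0; off < r; )
        {
          ssize_t w = write (out_fd, buf + off, (size_t) (r - off));
          if (w < 0)
            return -1;
          off += w;
        }
      total += r;
    }
  return r < 0 ? -1 : total;
}

/* Open PATH positioned at EOF and register an IN_MODIFY watch on it.
   On success returns the inotify descriptor and stores the file
   descriptor in *FILE_FD; returns -1 on failure.  */
static int
follow_start (char const *path, int *file_fd)
{
  assert (file_fd);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  int ifd = inotify_init1 (IN_CLOEXEC);
  if (ifd < 0
      || inotify_add_watch (ifd, path, IN_MODIFY) < 0
      || lseek (fd, 0, SEEK_END) < 0)
    {
      if (ifd >= 0)
        close (ifd);
      close (fd);
      return -1;
    }
  *file_fd = fd;
  return ifd;
}

/* Block until the kernel reports a modification, then emit the new bytes.
   Returns the number of bytes emitted, or -1 on error.  */
static ssize_t
follow_once (int ifd, int file_fd, int out_fd)
{
  /* The queued events are drained without being parsed, because this sketch
     watches a single file for a single event type.  A fuller follower would
     walk the struct inotify_event records in this buffer.  */
  char events[4096];

  if (read (ifd, events, sizeof events) <= 0)
    return -1;
  return copy_appended (file_fd, out_fd);
}
```

A follower would call follow_start once and then follow_once in a loop, passing STDOUT_FILENO as out_fd. The process sleeps inside read on the inotify descriptor until the kernel has something to report, so no periodic wake-ups are needed.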

Polling implementation: tail_forever
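The polling fallback can be sketched in a similar spirit. Again this is an illustration under stated assumptions rather than the tail_forever code: `poll_once` and `follow_poll` are hypothetical names, growth is detected by comparing `st_size` from `fstat` with the number of bytes already printed, and restarting from offset 0 when the file shrinks is this sketch's own simplification.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* One polling step: if the file behind FD has grown past *SEEN bytes, copy
   the new part to OUT_FD and advance *SEEN.  Returns the number of bytes
   copied, 0 if nothing is new, or -1 on error.  */
static ssize_t
poll_once (int fd, off_t *seen, int out_fd)
{
  struct stat st;
  char buf[4096];
  ssize_t total = 0;

  assert (seen);
  if (fstat (fd, &st) < 0)
    return -1;
  if (st.st_size < *seen)
    *seen = 0;                  /* file shrank: start over from the top */
  if (st.st_size == *seen)
    return 0;
  if (lseek (fd, *seen, SEEK_SET) < 0)
    return -1;

  while (*seen < st.st_size)
    {
      ssize_t r = read (fd, buf, sizeof buf);
      if (r < 0)
        return -1;
      if (r == 0)
        break;
      /* A short write is treated as an error to keep the sketch small.  */
      if (write (out_fd, buf, (size_t) r) != r)
        return -1;
      *seen += r;
      total += r;
    }
  return total;
}

/* Follow PATH for ROUNDS polling rounds, sleeping INTERVAL_SEC seconds
   whenever a round finds nothing new.  Returns 0 on success, -1 on error.  */
static int
follow_poll (char const *path, int out_fd, unsigned rounds,
             unsigned interval_sec)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  off_t seen = lseek (fd, 0, SEEK_END);  /* only content added later counts */
  int rc = seen < 0 ? -1 : 0;

  while (rc == 0 && rounds-- > 0)
    {
      ssize_t n = poll_once (fd, &seen, out_fd);
      if (n < 0)
        rc = -1;
      else if (n == 0)
        sleep (interval_sec);
    }
  close (fd);
  return rc;
}
```

Calling follow_poll (path, STDOUT_FILENO, rounds, 1) mirrors the one-second sleep described above. The bounded rounds parameter exists only so that the sketch terminates; a real follower would loop until interrupted.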

Published: 2024-02-05 04:31:32

Article link: https://www.4u4v.net/it/170724215863059.html
