Python: emulating a remote tail -f?

Posted 2024-12-08 08:02:51 · 987 characters · 0 views · 0 comments

We have several application servers, and a central monitoring server.

We are currently running ssh with "tail -f" from the monitoring server to stream several text logfiles in realtime from the app servers.

The issue, apart from the brittleness of the whole approach, is that killing the ssh process can sometimes leave zombie tail processes behind. We've mucked around with using -t to create pseudo-terminals, but it still sometimes leaves zombie processes around, and -t is apparently also causing issues elsewhere with the job-scheduling product we're using.

As a cheap-and-dirty solution until we can get proper centralised logging (Logstash and RabbitMQ, hopefully), I'm hoping to write a simple Python wrapper that will start ssh and "tail -f", still capture the output, but store the PID to a textfile on disk so we can kill the appropriate tail process later if need be.

At first I tried using subprocess.Popen, but then I hit issues actually getting the "tail -f" output back in realtime (which then needs to be redirected to a file) - apparently there are a host of blocking/buffering issues.

A few sources seemed to recommend using pexpect, pxssh or something like that. Ideally I'd like to use just Python and its included libraries, if possible - however, if a library is really the only way to do this, then I'm open to that.

Is there a nice easy way of getting Python to start up ssh with "tail -f", print the output in realtime to local STDOUT (so I can redirect it to a local file), and also save the PID to a file to kill later? Or, even if I don't use ssh with tail -f, is there some way of streaming a remote file in (near) realtime that includes saving the PID to a file?

Cheers,
Victor

EDIT: Just to clarify - we want the tail process to die when we kill the SSH process.

We want to start ssh and "tail -f" from the monitoring server, then when we Ctrl-C that, the tail process on the remote box should die as well - we don't want it to stay behind. Normally ssh with -t should fix it, but it isn't fully reliable, for reasons I don't understand, and it doesn't play nicely with our job scheduling.

Hence, using screen to keep the process alive at the other end is not what we want.
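To make the idea concrete, here's the kind of wrapper sketch I have in mind (the hostnames, paths, function names and the -F flag are my own placeholders and choices, and the -tt pty trick is exactly the best-effort approach described above):

```python
import subprocess
import sys

def build_tail_cmd(host, remote_path):
    # -tt forces a remote pseudo-terminal, so killing the local ssh
    # should (best-effort, as noted above) also kill the remote tail.
    # -F keeps following the file across log rotation.
    return ["ssh", "-tt", host, "tail", "-F", remote_path]

def stream(host, remote_path, pidfile):
    proc = subprocess.Popen(
        build_tail_cmd(host, remote_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        bufsize=1,                # line-buffered
        universal_newlines=True,  # text mode
    )
    # record the local ssh PID so the process can be killed later
    with open(pidfile, "w") as f:
        f.write(str(proc.pid))
    try:
        # iterating line by line avoids the usual read() buffering stalls
        for line in proc.stdout:
            sys.stdout.write(line)
            sys.stdout.flush()
    except KeyboardInterrupt:
        proc.terminate()  # SIGTERM to ssh; the pty should take tail with it
        proc.wait()

if __name__ == "__main__" and len(sys.argv) == 4:
    stream(sys.argv[1], sys.argv[2], sys.argv[3])
```

That still leaves the "is -tt reliable enough" question open, which is the part I'm unsure about.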

Comments (6)

追风人 2024-12-15 08:02:51

I know this doesn't answer your questions, but...

Maybe you could try using screen. If your session drops, you can always reattach and the tail will still be running. It also supports multiuser, so 2 users can view the same tail command.

http://en.wikipedia.org/wiki/GNU_Screen

Create a session named "log":

screen -S log

Disconnect:

[CTRL]+A D

Reattach:

screen -r log

List sessions (when you can't remember the name):

screen -list

To get rid of the session, just type exit while in it.

〆凄凉。 2024-12-15 08:02:51

I think the screen idea is the best one, but if you don't want to use ssh and you'd rather have a Python script do it, here is a simple Pythonic XML-RPC way of getting the info. It will only update when something has been appended to the file in question.

This is the client file. You tell it which file you want to read and which computer it's on.

#!/usr/bin/python
# This should be run on the computer where you want to output the files.
# You must pass a filename and a location.
# The filename must be the full path from the root directory, or a relative
# path from the directory the server is running in.
# The location must be in the form http://host:port (e.g. http://localhost:8000)

import xmlrpclib, time, sys, os

def tail(filename, location):
   # connect to server
   s = xmlrpclib.ServerProxy(location)

   # get starting length of file
   curSeek = s.GetSize(filename)

   # constantly check
   while 1:
      time.sleep(1) # make sure to sleep

      # get a new length of file and check for changes
      prevSeek = curSeek

      # sometimes this fails if the file is being written to;
      # we'll wait another second for it to finish
      try:
         curSeek = s.GetSize(filename)
      except Exception:
         pass

      # if file length has changed print it
      if prevSeek != curSeek:
         print s.tail(filename, prevSeek),


def main():
   # check that we got a file passed to us
   if len(sys.argv) != 3 or not os.path.isfile(sys.argv[1]):
      print 'Must give a valid filename.'
      return

   # run tail function
   tail(sys.argv[1], sys.argv[2])

main()

This is the server; you run it on each computer that has a file you want to look at. It's nothing fancy. You can daemonize it if you want. You just run it, and your client should connect to it if you tell the client where it is and you have the right ports open.

#!/usr/bin/python
# This runs on the computer(s) you want to read the file from
# Make sure to change out the HOST and PORT variables
HOST = 'localhost'
PORT = 8000

from SimpleXMLRPCServer import SimpleXMLRPCServer
from SimpleXMLRPCServer import SimpleXMLRPCRequestHandler

import os

def GetSize(filename):
   # get file size
   return os.stat(filename).st_size

def tail(filename, seek):
   # open the file, seek to the requested offset, and return everything after it
   f = open(filename,'r')
   f.seek(seek)
   return f.read()

def CreateServer():
   # Create server
   server = SimpleXMLRPCServer((HOST, PORT),
                               requestHandler=SimpleXMLRPCRequestHandler)

   # register functions
   server.register_function(tail, 'tail')
   server.register_function(GetSize, 'GetSize')

   # Run the server's main loop
   server.serve_forever()

# start server
CreateServer()

Ideally you run the server once, then from the client run "python client.py sample.log http://somehost:8000" and it should start going. Hope that helps.

榕城若虚 2024-12-15 08:02:51

The paramiko module supports connecting via ssh from Python.

http://www.lag.net/paramiko/

pysftp has some examples of using it, and the execute-command method might be what you're looking for. It will create a file-like object from the command you execute. I can't say whether it gives you live data, though.

http://code.google.com/p/pysftp/
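As a rough, untested sketch (the hostname, username, path and function names are placeholders, and this uses exec_command on a raw paramiko channel rather than pysftp), streaming a remote tail with a pty might look something like this:

```python
import sys

def remote_tail_command(path, lines=10):
    # build the remote command; -F keeps following across log rotation
    return "tail -F -n %d %s" % (lines, path)

def stream_remote(host, path, username):
    import paramiko  # third-party: pip install paramiko
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username)
    channel = client.get_transport().open_session()
    channel.get_pty()  # like ssh -t: the remote tail should die with the channel
    channel.exec_command(remote_tail_command(path))
    try:
        while True:
            data = channel.recv(4096)
            if not data:  # channel closed by the remote side
                break
            sys.stdout.write(data.decode("utf-8", "replace"))
            sys.stdout.flush()
    finally:
        channel.close()
        client.close()
```

Whether get_pty is any more reliable than ssh -t at killing the remote tail, I can't promise.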

暖树树初阳… 2024-12-15 08:02:51

I've posted a question about something similar, with code (paramiko):

tail -f over ssh with Paramiko has an increasing delay

太阳公公是暖光 2024-12-15 08:02:51

I wrote a function that does that:

import paramiko
import time
import json

DEFAULT_MACHINE_USERNAME="USERNAME"
DEFAULT_KEY_PATH="DEFAULT_KEY_PATH"

def ssh_connect(machine, username=DEFAULT_MACHINE_USERNAME,
                key_filename=DEFAULT_KEY_PATH):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(hostname=machine, username=username, key_filename=key_filename)
    return ssh

def tail_remote_file(hostname, filepath, key_path=DEFAULT_KEY_PATH,
                     close_env_variable="CLOSE_TAIL_F", env_file='~/.profile'):
    ssh = ssh_connect(hostname, key_filename=key_path)

    def set_env_variable(to_value):
        to_value_str = "true" if to_value else "false"
        from_value_str = "false" if to_value else "true"
        ssh.exec_command('sed -i \'s/export %s=%s/export %s=%s/g\' %s' %
                         (close_env_variable, from_value_str,
                          close_env_variable, to_value_str, env_file))
        time.sleep(1)

    def get_env_variable():
        command = "source .profile; echo $%s" % close_env_variable
        stdin, stdout_i, stderr = ssh.exec_command(command)
        print(command)
        out = stdout_i.read().decode().replace('\n', '')
        return out

    def get_last_line_number(lines_i, line_num):
        return int(lines_i[-1].split('\t')[0]) + 1 if lines_i else line_num

    def execute_command(line_num):
        command = "cat -n %s | tail --lines=+%d" % (filepath, line_num)
        stdin, stdout_i, stderr = ssh.exec_command(command)
        stderr = stderr.read().decode()
        if stderr:
            print(stderr)
        return stdout_i.readlines()

    stdout = get_env_variable()
    if not stdout:
        ssh.exec_command("echo 'export %s=false' >> %s" %
                         (close_env_variable, env_file))
    else:
        ssh.exec_command(
            'sed -i \'s/export %s=true/export %s=false/g\' %s' %
            (close_env_variable, close_env_variable, env_file))
    set_env_variable(False)

    lines = execute_command(0)
    last_line_num = get_last_line_number(lines, 0)

    while not json.loads(get_env_variable()):
        for l in lines:
            print('\t'.join(t.replace('\n', '') for t in l.split('\t')[1:]))
        last_line_num = get_last_line_number(lines, last_line_num)
        lines = execute_command(last_line_num)
        time.sleep(1)

    ssh.close()
墨落画卷 2024-12-15 08:02:51

I've written a library that allows you to do just this - check out the "remote" feature of PimpedSubprocess (on github) or PimpedSubprocess (on PyPI)
