GatewayWorker deployed as separate services: the connection drops after a period of inactivity and BusinessWorker stops executing tasks

houniao506

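Hello, I have a question:

Deployment:
GatewayWorker's Register, Gateway, and BusinessWorker are all deployed separately and started individually. All of them run on the same local machine.

Problem description:
If my BusinessWorker goes a while without running any tasks, it disconnects from the Gateway and can no longer run background tasks. I don't know the exact interval; typically it happens overnight: when I trigger an asynchronous task the next day it does not run, and only a reload gets it working again. As long as tasks keep running, the disconnect does not occur.

Log output: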

==> workerman.log <==
2019-02-15 23:07:25 pid:13913 Workerman stop success
2019-02-15 23:07:26 pid:12583 SendBufferToWorker fail. The connections between Gateway and BusinessWorker are not ready. See http://wiki.workerman.net/Error3
2019-02-15 23:07:26 pid:12583 SendBufferToWorker fail. The connections between Gateway and BusinessWorker are not ready. See http://wiki.workerman.net/Error3
2019-02-15 23:38:32 pid:13913 Workerman stopping ...
2019-02-15 23:38:32 pid:13913 Workerman has been stopped
2019-02-15 23:38:34 pid:13994 Workerman restart
2019-02-15 23:38:34 pid:13994 Workerman is stopping ...
2019-02-15 23:38:34 pid:13994 Workerman stop success
2019-02-16 08:35:34 pid:16411 Workerman reload
2019-02-16 08:35:34 pid:13997 Workerman reloading
3 answers

walkor

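The business code may be calling an external interface without handling timeouts properly and ending up blocked. When the problem occurs, please post a screenshot of the status output.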

houniao506

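Hello Walkor,
This morning the BusinessWorker was again not executing tasks (I first ran into this problem last year, and at the time my only workaround was a Linux cron job that ran reload).
I am certain that no external interfaces are being called, because no business logic runs in the early hours of the morning. Thanks again for your reply!

Additional detail: my system environment is CentOS Linux release 7.4.1708 (Core), 64-bit.
The status output follows (captured before the reload):
Workerman status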
----------------------------------------------GLOBAL STATUS----------------------------------------------------
Workerman version:3.5.14 PHP version:7.2.12
start time:2019-02-16 22:09:03 run 0 days 10 hours
load average: 8.87, 9, 9 event-loop:\Workerman\Events\Event
2 workers 11 processes
worker_name exit_status exit_count
FileMonitor 0 0
commonBusinessworker 0 0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid memory listening worker_name connections send_fail timers total_request qps status
10845 2M none FileMonitor 0 0 0 0 0
10846 4M none commonBusinessworker 5 0 0 10 0
10847 4M none commonBusinessworker 5 0 0 4 0
10848 4M none commonBusinessworker 5 0 0 1 0
10849 4M none commonBusinessworker 5 0 0 4 0
10850 4M none commonBusinessworker 5 0 0 101 0
10851 4M none commonBusinessworker 5 0 0 6 0
10852 4M none commonBusinessworker 5 0 0 16 0
10853 4M none commonBusinessworker 5 0 0 1 0
10854 4M none commonBusinessworker 5 0 0 7 0
10855 4M none commonBusinessworker 5 0 0 1 0
----------------------------------------------PROCESS STATUS---------------------------------------------------
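Summary 42M - - 50 0 0 151 0

After capturing the status above, I ran a reload and the asynchronous tasks immediately started executing again. The log details are as follows: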

2019-02-16 20:34:50 pid:10680 Workerman stopping ...
2019-02-16 20:34:50 pid:10680 Workerman has been stopped
2019-02-16 20:34:50 pid:10692 Workerman stop success
2019-02-16 20:35:42 pid:10707 Workerman status
2019-02-16 20:36:00 pid:10708 Workerman status
2019-02-16 20:36:23 pid:10710 Workerman status
2019-02-16 21:18:33 pid:10771 Workerman status
2019-02-16 21:19:25 pid:10774 Workerman status
2019-02-16 22:09:03 pid:10842 Workerman restart
2019-02-16 22:09:03 pid:10842 Workerman is stopping ...
2019-02-16 22:09:03 pid:10694 Workerman stopping ...
2019-02-16 22:09:03 pid:10694 Workerman has been stopped
2019-02-16 22:09:03 pid:10842 Workerman stop success
2019-02-16 22:09:19 pid:10856 Workerman status
2019-02-17 08:18:32 pid:13297 Workerman status
2019-02-17 08:26:18 pid:13451 Workerman reload
2019-02-17 08:26:18 pid:10844 Workerman reloading

walkor

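Judging from the status output, the BusinessWorker side appears to be running against 4 Gateway processes; if there are indeed 4 Gateway processes, the BusinessWorker status looks normal. However, the server load looks rather high, so use top to see which process is causing it.
Also, please post a diagram of the separated deployment showing which services run on which IPs, along with each service's configuration. And could you describe what the business logic actually does?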
 

BusinessWorker disconnects from the Gateway after a period without running tasks

How did you determine that the BusinessWorker had disconnected from the Gateway?
 

Cannot run background tasks

What asynchronous tasks? How did you determine that the background tasks could not run? Also, could you post your Events.php code?
 
 

  • houniao506 2019-03-28

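    After some time investigating and observing, I found that tasks stopped executing because the Redis and MySQL connections were being closed automatically; the references to them were created in onWorkerStart. After I changed that, the problem went away.
    Thanks for the prompt replies!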
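
The thread does not show what the fix looked like. As a rough sketch of one common approach (not the poster's actual code): keep a factory for the connection instead of a single handle created in onWorkerStart, and reconnect once when an operation fails because the idle connection was closed by the server. `ReconnectingClient` and `run` are hypothetical names introduced here, not part of Workerman or GatewayWorker, and the sketch assumes the underlying client throws an exception when its connection is gone (for PDO that means using `PDO::ERRMODE_EXCEPTION`).

```php
<?php
// Hypothetical helper for illustration; not part of Workerman or GatewayWorker.
final class ReconnectingClient
{
    /** @var callable Factory that opens and returns a fresh connection. */
    private $factory;

    /** @var object|null Current connection, opened lazily on first use. */
    private $conn = null;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    /**
     * Run $operation with the current connection. If it throws (for example
     * because the server closed an idle connection overnight), discard the
     * stale handle, open a new connection, and retry exactly once.
     */
    public function run(callable $operation)
    {
        try {
            return $operation($this->connection());
        } catch (\Throwable $e) {
            $this->conn = null;                      // drop the stale handle
            return $operation($this->connection());  // a second failure propagates
        }
    }

    private function connection()
    {
        if ($this->conn === null) {
            $this->conn = ($this->factory)();
        }
        return $this->conn;
    }
}

// Possible use inside onWorkerStart (assumes the phpredis extension and a
// local Redis server; host, port, and key are placeholders):
//
//   $redis = new ReconnectingClient(function () {
//       $r = new \Redis();
//       $r->connect('127.0.0.1', 6379);
//       return $r;
//   });
//   $value = $redis->run(function ($r) { return $r->get('some-key'); });
```

Retrying on any exception is deliberately crude: it also repeats operations that failed for reasons other than a dropped connection, so real code would likely catch only the client's connection errors. Another option is to keep the connection warm with a periodic ping from a timer, or to raise the server-side idle timeouts.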
