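Gearman Distributed Task Scheduling System
Fang Shaosen (方少森) @ Baidu

I. Overview

Gearman is a distributed framework for dispatching and scheduling tasks. It supports multiple languages, concurrent task execution, and load balancing.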
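Gearman has the following features:
1. Open source.
2. Multi-language interfaces: PHP, Perl, Python, C, and others.
3. Flexible: it is not tied to any particular model, so distributed patterns such as map/reduce can be built on top of it freely.
4. Fast, with low overhead.
5. Embeddable and lightweight.
6. No single point of failure.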

II. How Gearman works

Gearman consists of three basic components: the client, the worker, and the job server.
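client - creates the job to be run and submits it to the job server.
job server - finds the most suitable worker and hands the job to that worker.
worker - receives jobs from the job server, executes them, and returns the result; the result travels back to the client through the job server.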
As the architecture figure shows, both the client and the worker are APIs provided by Gearman.
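
The division of labour between the three components can be sketched with a toy, in-process model. This is only an illustration: the `JobServer` class and its methods `register`, `submit`, and `run_pending` are hypothetical names invented for this sketch, not part of any Gearman library, and there is no networking involved. The worker choice is a naive rotation, assumed here for simplicity.

```python
from collections import deque


class JobServer:
    """Toy in-process stand-in for a job server (hypothetical, not the real daemon)."""

    def __init__(self):
        self.workers = {}     # function name -> list of worker callables
        self.queue = deque()  # pending jobs: (handle, function name, payload)
        self.results = {}     # job handle -> result returned by the worker
        self._counter = 0

    def register(self, function_name, func):
        """Worker side: announce that this worker can run function_name."""
        self.workers.setdefault(function_name, []).append(func)

    def submit(self, function_name, payload):
        """Client side: queue a job and get back a job handle."""
        self._counter += 1
        handle = f"H:toy:{self._counter}"
        self.queue.append((handle, function_name, payload))
        return handle

    def run_pending(self):
        """Server side: hand each queued job to a capable worker, store the result."""
        while self.queue:
            handle, function_name, payload = self.queue.popleft()
            candidates = self.workers.get(function_name)
            if not candidates:
                raise LookupError(f"no worker registered for {function_name!r}")
            worker = candidates.pop(0)
            candidates.append(worker)  # naive rotation between capable workers
            self.results[handle] = worker(payload)


server = JobServer()
server.register("reverse", lambda text: text[::-1])  # worker registers "reverse"
handle = server.submit("reverse", "Hello World!")    # client submits a job
server.run_pending()                                 # server dispatches it
print(server.results[handle])                        # prints "!dlroW olleH"
```

In a real deployment the three roles are separate processes, usually on different machines, and the result is pushed back to the client over the network rather than read from a dictionary.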
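
III. A simple example

This PHP-based example reverses a string. Client code:

[Listing: client code]

When the job server receives the job request, it looks for a worker that is able to run "reverse" and has that worker execute the job.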
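The worker code is as follows:

[Listing: worker code]

The sequence diagram for this example is as follows:

[Figure: sequence diagram of the example]

IV. Workflow of the three roles

A normal Gearman job execution proceeds as shown in the figure above:
1. The worker registers with the Gearman server the functions it can perform (CAN_DO).
2. The worker tries to grab a job (GRAB_JOB).
3. The server tells the worker that there is no job for now (NO_JOB).
4. The worker tells the server: "I'm going to sleep for a while; wake me up when there is work" (PRE_SLEEP).
5. A client sends a job request to the server (SUBMIT_JOB).
6. The server wakes up the workers that can do this job (NOOP); more than one worker may be woken.
7. A worker sends the server a "hungry" request, trying to grab a job (GRAB_JOB).
8. The server picks one worker and assigns the job to it (JOB_ASSIGN).
9. The server tells the client: "I have arranged for someone to handle your request, please wait" (JOB_CREATED).
10. After working for a while, the worker tells the server that it is done (WORK_COMPLETE).
11. The server returns the result to the client.

Notes:

1. Job classification:
- By priority: normal (SUBMIT_JOB), high (SUBMIT_JOB_HIGH), low (SUBMIT_JOB_LOW).
- By execution mode: foreground (SUBMIT_JOB, SUBMIT_JOB_HIGH, SUBMIT_JOB_LOW) and background (SUBMIT_JOB_BG, SUBMIT_JOB_HIGH_BG, SUBMIT_JOB_LOW_BG). The main difference is that the client is kept informed of the status of a foreground job, whereas it receives no updates for a background job.

2. Job status notifications (worker --> server --> client):
- WORK_DATA
- WORK_WARNING
- WORK_STATUS: for long-running jobs, the worker should report the job status at regular intervals.
- WORK_COMPLETE
- WORK_FAIL
- WORK_EXCEPTION

3. Server monitoring

Gearman has an "Administrative Protocol" dedicated to monitoring the Gearman server. It mainly covers the following:
- status: for each registered function, the number of queued jobs, the number of running jobs, and the number of available workers.
- worker details: the functions each worker has registered and its IP address.
- the maximum length of the server's job queue: this can be both queried and set.

V. Communication protocol

1. Binary packet format

All requests and responses are binary packets consisting of a header and optional data.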
The header contains the following fields:

4 byte magic code - This is either "\0REQ" for requests or "\0RES" for responses.

4 byte type - A big-endian (network-order) integer containing an enumerated packet type. Possible values are:

#   Name                Magic  Type
1   CAN_DO              REQ    Worker
2   CANT_DO             REQ    Worker
3   RESET_ABILITIES     REQ    Worker
4   PRE_SLEEP           REQ    Worker
5   (unused)            -      -
6   NOOP                RES    Worker
7   SUBMIT_JOB          REQ    Client
8   JOB_CREATED         RES    Client
9   GRAB_JOB            REQ    Worker
10  NO_JOB              RES    Worker
11  JOB_ASSIGN          RES    Worker
12  WORK_STATUS         REQ    Worker
                        RES    Client
13  WORK_COMPLETE       REQ    Worker
                        RES    Client
14  WORK_FAIL           REQ    Worker
                        RES    Client
15  GET_STATUS          REQ    Client
16  ECHO_REQ            REQ    Client/Worker
17  ECHO_RES            RES    Client/Worker
18  SUBMIT_JOB_BG       REQ    Client
19  ERROR               RES    Client/Worker
20  STATUS_RES          RES    Client
21  SUBMIT_JOB_HIGH     REQ    Client
22  SET_CLIENT_ID       REQ    Worker
23  CAN_DO_TIMEOUT      REQ    Worker
24  ALL_YOURS           REQ    Worker
25  WORK_EXCEPTION      REQ    Worker
                        RES    Client
26  OPTION_REQ          REQ    Client/Worker
27  OPTION_RES          RES    Client/Worker
28  WORK_DATA           REQ    Worker
                        RES    Client
29  WORK_WARNING        REQ    Worker
                        RES    Client
30  GRAB_JOB_UNIQ       REQ    Worker
31  JOB_ASSIGN_UNIQ     RES    Worker
32  SUBMIT_JOB_HIGH_BG  REQ    Client
33  SUBMIT_JOB_LOW      REQ    Client
34  SUBMIT_JOB_LOW_BG   REQ    Client
35  SUBMIT_JOB_SCHED    REQ    Client
36  SUBMIT_JOB_EPOCH    REQ    Client

4 byte size - A big-endian (network-order) integer containing the size of the data being sent after the header.

Arguments in the data part are separated by a single NULL byte. The last argument runs from the last NULL separator to the end of the data, as given by the size field.
Job handle arguments must not be longer than 64 bytes, including the NULL terminator.
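
The framing rules above (a 12-byte header with magic, type, and size, followed by NULL-separated arguments) can be illustrated with a short Python sketch. The helper names `pack_packet` and `unpack_packet` are hypothetical, chosen for this example only; the packet type number is taken from the table above. The caller is assumed to know how many arguments a given packet type carries, since the last argument is opaque and may itself contain NULL bytes.

```python
import struct

# Header layout: 4-byte magic, 4-byte type, 4-byte size, all big-endian.
HEADER = struct.Struct(">4sII")
REQ, RES = b"\0REQ", b"\0RES"
SUBMIT_JOB = 7  # packet type number from the table above


def pack_packet(magic, packet_type, *args):
    """Build one packet: header followed by NULL-separated arguments."""
    data = b"\0".join(args)
    return HEADER.pack(magic, packet_type, len(data)) + data


def unpack_packet(buf, arg_count):
    """Split one packet into (magic, type, [arguments]).

    Only the first arg_count - 1 separators are split on, so the final
    argument keeps any NULL bytes it contains.
    """
    magic, packet_type, size = HEADER.unpack_from(buf)
    data = buf[HEADER.size:HEADER.size + size]
    args = data.split(b"\0", arg_count - 1) if arg_count else []
    return magic, packet_type, args


# SUBMIT_JOB carries: function name, unique ID (empty here), opaque workload.
packet = pack_packet(REQ, SUBMIT_JOB, b"reverse", b"", b"Hello World!")
print(len(packet))               # prints 33 (12-byte header + 21 bytes of data)
print(unpack_packet(packet, 3))
```

Splitting with a bounded count rather than on every NULL byte is what allows binary workloads to pass through unchanged.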
The packets exchanged between a client/worker and the job server have the following formats:

Client/Worker Requests
----------------------

These request types may be sent by either a client or a worker:

ECHO_REQ

When a job server receives this request, it simply generates an ECHO_RES packet with the data. This is primarily used for testing or debugging.

Arguments:
- Opaque data that is echoed back in response.

Client/Worker Responses
-----------------------

These response types may be sent to either a client or a worker:

ECHO_RES

This is sent in response to an ECHO_REQ request. The server doesn't look at or modify the data argument, it just sends it back.

Arguments:
- Opaque data that is echoed back in response.

ERROR

This is sent whenever the server encounters an error and needs to notify a client or worker.

Arguments:
- NULL byte terminated error code string.
- Error text.

Client Requests
---------------

These request types may only be sent by a client:

SUBMIT_JOB, SUBMIT_JOB_BG,
SUBMIT_JOB_HIGH, SUBMIT_JOB_HIGH_BG,
SUBMIT_JOB_LOW, SUBMIT_JOB_LOW_BG

A client issues one of these when a job needs to be run. The server will then assign a job handle and respond with a JOB_CREATED packet.

If one of the BG versions is used, the client is not updated with status or notified when the job has completed (it is detached).

The Gearman job server queue is implemented with three levels: normal, high, and low. Jobs submitted with one of the HIGH versions always take precedence, and jobs submitted with the normal versions take precedence over the LOW versions.

Arguments:
- NULL byte terminated function name.
- NULL byte terminated unique ID.
- Opaque data that is given to the function as an argument.

SUBMIT_JOB_SCHED

Just like SUBMIT_JOB_BG, but run job at given time instead of immediately.
This is not currently used and may be removed.

Arguments:
- NULL byte terminated function name.
- NULL byte terminated unique ID.
- NULL byte terminated minute (0-59).
- NULL byte terminated hour (0-23).
- NULL byte terminated day of month (1-31).
- NULL byte terminated month (1-12).
- NULL byte terminated day of week (0-6, 0 = Monday).
- Opaque data that is given to the function as an argument.

SUBMIT_JOB_EPOCH

Just like SUBMIT_JOB_BG, but run job at given time instead of immediately. This is not currently used and may be removed.

Arguments:
- NULL byte terminated function name.
- NULL byte terminated unique ID.
- NULL byte terminated epoch time.
- Opaque data that is given to the function as an argument.

GET_STATUS

A client issues this to get status information for a submitted job.

Arguments:
- Job handle that was given in JOB_CREATED packet.

OPTION_REQ

A client issues this to set an option for the connection in the job server. Returns an OPTION_RES packet on success, or an ERROR packet on failure.

Arguments:
- Name of the option to set. Possibilities are:
  * "exceptions" - Forward WORK_EXCEPTION packets to the client.

Client Responses
----------------

These response types may only be sent to a client:

JOB_CREATED

This is sent in response to one of the SUBMIT_JOB* packets. It signifies to the client that the server successfully received the job and queued it to be run by a worker.

Arguments:
- Job handle assigned by server.

WORK_DATA, WORK_WARNING, WORK_STATUS, WORK_COMPLETE,
WORK_FAIL, WORK_EXCEPTION

For non-background jobs, the server forwards these packets from the worker to clients. See "Worker Requests" for more information and arguments.

STATUS_RES

This is sent in response to a GET_STATUS request.
This is used by clients that have submitted a job with SUBMIT_JOB_BG to see if the job has been completed, and if not, to get the percentage complete.

Arguments:
- NULL byte terminated job handle.
- NULL byte terminated known status, this is 0 (false) or 1 (true).
- NULL byte terminated running status, this is 0 (false) or 1 (true).
- NULL byte terminated percent complete numerator.
- Percent complete denominator.

OPTION_RES

Successful response to the OPTION_REQ request.

Arguments:
- Name of the option that was set, see OPTION_REQ for possibilities.

Worker Requests
---------------

These request types may only be sent by a worker:

CAN_DO

This is sent to notify the server that the worker is able to perform the given function. The worker is then put on a list to be woken up whenever the job server receives a job for that function.

Arguments:
- Function name.

CAN_DO_TIMEOUT

Same as CAN_DO, but with a timeout value on how long the job is allowed to run. After the timeout value, the job server will mark the job as failed and notify any listening clients.

Arguments:
- NULL byte terminated function name.
- Timeout value.

CANT_DO

This is sent to notify the server that the worker is no longer able to perform the given function.

Arguments:
- Function name.

RESET_ABILITIES

This is sent to notify the server that the worker is no longer able to do any functions it previously registered with CAN_DO or CAN_DO_TIMEOUT.

Arguments:
- None.

PRE_SLEEP

This is sent to notify the server that the worker is about to sleep, and that it should be woken up with a NOOP packet if a job comes in for a function the worker is able to perform.

Arguments:
- None.

GRAB_JOB

This is sent to the server to request any available jobs on the queue. The server will respond with either NO_JOB or JOB_ASSIGN, depending on whether a job is available.

Arguments:
- None.

GRAB_JOB_UNIQ

Just like GRAB_JOB, but return JOB_ASSIGN_UNIQ when there is a job.

Arguments:
- None.

WORK_DATA

This is sent to update the client with data from a running job.
A worker should use this when it needs to send updates, send partial results, or flush data during long running jobs. It can also be used to break up a result so the worker does not need to buffer the entire result before sending in a WORK_COMPLETE packet.

Arguments:
- NULL byte terminated job handle.
- Opaque data that is returned to the client.

WORK_WARNING

This is sent to update the client with a warning. It acts just like a WORK_DATA response, but should be treated as a warning instead of normal response data.

Arguments:
- NULL byte terminated job handle.
- Opaque data that is returned to the client.

WORK_STATUS

This is sent to update the server (and any listening clients) of the status of a running job. The worker should send these periodically for long running jobs to update the percentage complete. The job server should store this information so a client who issued a background command may retrieve it later with a GET_STATUS request.

Arguments:
- NULL byte terminated job handle.
- NULL byte terminated percent complete numerator.
- Percent complete denominator.

WORK_COMPLETE

This is to notify the server (and any listening clients) that the job completed successfully.

Arguments:
- NULL byte terminated job handle.
- Opaque data that is returned to the client as a response.

WORK_FAIL

This is to notify the server (and any listening clients) that the job failed.

Arguments:
- Job handle.

WORK_EXCEPTION

This is to notify the server (and any listening clients) that the job failed with the given exception.

Arguments:
- NULL byte terminated job handle.
- Opaque data that is returned to the client as an exception.

SET_CLIENT_ID

This sets the worker ID in a job server so monitoring and reporting commands can uniquely identify the various workers, and different connections to job servers from the same worker.

Arguments:
- Unique string to identify the worker instance.

ALL_YOURS

Not yet implemented.
This looks like it is used to notify a job server that this is the only job server it is connected to, so a job can be given directly to this worker with a JOB_ASSIGN and no worker wake-up is required.

Arguments:
- None.

Worker Responses
----------------

These response types may only be sent to a worker:

NOOP

This is used to wake up a sleeping worker so that it may grab a pending job.

Arguments:
- None.

NO_JOB

This is given in response to a GRAB_JOB request to notify the worker there are no pending jobs that need to run.

Arguments:
- None.

JOB_ASSIGN

This is given in response to a GRAB_JOB request to give the worker information needed to run the job. All communication about the job (such as status updates and completion response) should use the handle, and the worker should run the given function with the argument.

Arguments:
- NULL byte terminated job handle.
- NULL byte terminated function name.
- Opaque data that is given to the function as an argument.

JOB_ASSIGN_UNIQ

This is given in response to a GRAB_JOB_UNIQ request and acts just like JOB_ASSIGN but with the client assigned unique ID.

Arguments:
- NULL byte terminated job handle.
- NULL byte terminated function name.
- NULL byte terminated unique ID.
- Opaque data that is given to the function as an argument.

2. Administrative protocol

The job server also supports text-based commands for retrieving information and performing administrative tasks.
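
As an illustration of how a text-based reply might be consumed, the sketch below parses the output of the `status` command. It assumes the reply format described in the Gearman protocol documentation rather than in the text above: one line per function with tab-separated columns (function name, jobs in queue, running jobs, available workers), terminated by a line holding a single period. The function name `parse_status` and the sample reply are made up for this example.

```python
def parse_status(text):
    """Parse a `status` reply into a dict keyed by function name.

    Assumed line format: FUNCTION<TAB>TOTAL<TAB>RUNNING<TAB>AVAILABLE_WORKERS,
    with a lone "." marking the end of the list.
    """
    table = {}
    for line in text.splitlines():
        if line == ".":   # terminator line
            break
        if not line:      # tolerate blank lines
            continue
        name, total, running, workers = line.split("\t")
        table[name] = {
            "queued": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return table


# Invented sample reply, for illustration only.
sample = "reverse\t3\t1\t2\nresize\t0\t0\t1\n.\n"
print(parse_status(sample)["reverse"])
```

Such a reply would normally be read from a TCP connection to the job server after sending the command as a line of text; the socket handling is left out here.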