Understanding GlusterFS CLI Code – Part 1

GlusterFS CLI code follows a client-server architecture, and we should keep that in mind while trying to understand the CLI framework: “glusterd” acts as the server and the gluster binary (i.e. /usr/sbin/gluster) acts as the client.

In this write-up I have taken the “gluster volume create” command as an example and will provide code snippets, gdb backtraces and Wireshark network traces.

  • All function calls start when “gluster volume create <volume-name> <brick1> <brick2>” is entered on the command line.
  • “gluster”, i.e. main() in cli.c, processes the command line input and sends it to glusterd along with the relevant callback function information, as described below.
  • To be specific, gf_cli_create_volume() in cli/src/cli-rpc-ops.c sends the request to glusterd along with the callback function information, i.e. gf_cli_create_volume_cbk(). For more information look at the code snippet below, the condensed sketch right after this list, and the gdb backtrace #bt_1 (check #bt_1 below).
    • ret = cli_to_glusterd (&req, frame, gf_cli_create_volume_cbk, (xdrproc_t) xdr_gf_cli_req, dict, GLUSTER_CLI_CREATE_VOLUME, this, cli_rpc_prog, NULL);
    • The CLI contacts glusterd on localhost:24007, as glusterd’s management port is 24007/TCP.
  • glusterd uses the procedure identifier passed in the above call, i.e. GLUSTER_CLI_CREATE_VOLUME, to find the relevant handler function so that it can take the execution forward.
  • Once the request is sent to glusterd, the client just waits for the reply. The wait happens in the “event_dispatch (ctx->event_pool);” call in main() in cli.c. #bt_2
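
Putting the pieces above together, a condensed sketch of gf_cli_create_volume() looks like this (a paraphrase of cli/src/cli-rpc-ops.c with error handling, logging and cleanup trimmed, not the exact source):

int32_t
gf_cli_create_volume (call_frame_t *frame, xlator_t *this, void *data)
{
        gf_cli_req   req  = {{0,}};
        dict_t      *dict = data;   /* options parsed from the command line words */
        int          ret  = -1;

        /* cli_to_glusterd() serializes the dict into req, picks the
         * GLUSTER_CLI_CREATE_VOLUME procedure out of cli_rpc_prog and
         * submits the request to glusterd over the RPC connection;
         * gf_cli_create_volume_cbk() runs later, when the reply arrives
         * on the event loop. */
        ret = cli_to_glusterd (&req, frame, gf_cli_create_volume_cbk,
                               (xdrproc_t) xdr_gf_cli_req, dict,
                               GLUSTER_CLI_CREATE_VOLUME, this,
                               cli_rpc_prog, NULL);

        return ret;
}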

There are some other important functions (on the client side of the CLI framework) worth checking out. If you are debugging any CLI issue, there is a high probability you will come across them. Below are those functions:

  • cli_cmd_volume_create_cbk():
    • The call to gf_cli_create_volume() goes from here, and the mechanism used to find gf_cli_create_volume() is a little different.
    • Check the structure rpc_clnt_procedure_t and its use in cli_cmd_volume_create_cbk(), i.e. “proc = &cli_rpc_prog->proctable[GLUSTER_CLI_CREATE_VOLUME];” (see the standalone illustration after this list).
  • parse_cmdline() and cli_opt_parse(): these functions parse the command line input.
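
To make the proctable lookup concrete, here is a small standalone illustration of the dispatch pattern cli_cmd_volume_create_cbk() uses (the struct fields and the enum value are simplified stand-ins, not the actual GlusterFS definitions): the procedure number indexes into an array of {name, function pointer} entries and the stored function pointer is invoked.

#include <stdio.h>

/* simplified stand-ins for the types declared in rpc/rpc-lib/src/rpc-clnt.h */
typedef int (*clnt_fn_t) (void *frame, void *this, void *data);

typedef struct rpc_clnt_procedure {
        char      *procname;
        clnt_fn_t  fn;
} rpc_clnt_procedure_t;

typedef struct rpc_clnt_program {
        char                 *progname;
        rpc_clnt_procedure_t *proctable;
} rpc_clnt_program_t;

enum { GLUSTER_CLI_CREATE_VOLUME = 0 };         /* the real value differs */

static int
create_volume_stub (void *frame, void *this, void *data)
{
        printf ("gf_cli_create_volume() would send the RPC from here\n");
        return 0;
}

static rpc_clnt_procedure_t proctable[] = {
        [GLUSTER_CLI_CREATE_VOLUME] = { "CREATE_VOLUME", create_volume_stub },
};

static rpc_clnt_program_t cli_prog = { "gluster cli", proctable };

int
main (void)
{
        /* same shape as: proc = &cli_rpc_prog->proctable[GLUSTER_CLI_CREATE_VOLUME]; */
        rpc_clnt_procedure_t *proc = &cli_prog.proctable[GLUSTER_CLI_CREATE_VOLUME];

        if (proc->fn)
                return proc->fn (NULL, NULL, NULL);
        return -1;
}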

GDB backtraces of function calls involved in the client side of the framework:

#bt_1:

Breakpoint 1, gf_cli_create_volume (frame=0x664544, this=0x3b9ac83600, data=0x679e48) at cli-rpc-ops.c:3240
3240            gf_cli_req              req = {{0,}};
(gdb) bt
#0  gf_cli_create_volume (frame=0x664544, this=0x3b9ac83600, data=0x679e48) at cli-rpc-ops.c:3240
#1  0x0000000000411587 in cli_cmd_volume_create_cbk (state=0x7fffffffe270, word=<value optimized out>, words=<value optimized out>,
    wordcount=<value optimized out>) at cli-cmd-volume.c:410
#2  0x000000000040aa8b in cli_cmd_process (state=0x7fffffffe270, argc=5, argv=0x7fffffffe460) at cli-cmd.c:140
#3  0x000000000040a510 in cli_batch (d=<value optimized out>) at input.c:34
#4  0x0000003b99a07851 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003b996e894d in clone () from /lib64/libc.so.6

#bt_2:

#0  gf_cli_create_volume_cbk (req=0x68a57c, iov=0x68a5bc, count=1, myframe=0x664544) at cli-rpc-ops.c:762
#1  0x0000003b9b20dd85 in rpc_clnt_handle_reply (clnt=0x68a390, pollin=0x6a3f20) at rpc-clnt.c:772
#2  0x0000003b9b20f327 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x68a3c0, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:905
#3  0x0000003b9b20ab78 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#4  0x00007ffff711fd86 in socket_event_poll_in (this=0x6939b0) at socket.c:2119
#5  0x00007ffff712169d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x6939b0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2229
#6  0x0000003b9aa62327 in event_dispatch_epoll_handler (event_pool=0x662710) at event-epoll.c:384
#7  event_dispatch_epoll (event_pool=0x662710) at event-epoll.c:445
#8  0x0000000000409891 in main (argc=<value optimized out>, argv=<value optimized out>) at cli.c:666
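
#bt_2 also shows where the client had been waiting: frames #6-#8 are the epoll-based event loop entered from event_dispatch() in main(). Schematically, the loop looks something like the heavily simplified sketch below (the real event-epoll.c additionally manages a table of handlers registered per file descriptor; this is only an illustration):

#include <sys/epoll.h>

/* Block in epoll_wait() until a registered descriptor (here, the socket to
 * glusterd) becomes readable; GlusterFS then calls the handler registered
 * for that descriptor (socket_event_handler), which eventually runs the RPC
 * callback, e.g. gf_cli_create_volume_cbk(). */
static int
wait_and_dispatch (int epfd)
{
        struct epoll_event ev;

        for (;;) {
                int n = epoll_wait (epfd, &ev, 1, -1);
                if (n < 0)
                        return -1;
                /* look up and invoke the handler registered for ev.data here */
        }
}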

Code flow for the “volume create” command in “glusterd”, i.e. the server side of the CLI framework

  • As mentioned in the CLI client’s code flow, the identifier “GLUSTER_CLI_CREATE_VOLUME” helps glusterd find the relevant function for the command. To see how this is done, check the structure gd_svc_cli_actors in glusterd-handler.c. I have also copied a small snippet of it below.
    • rpcsvc_actor_t gd_svc_cli_actors[] = {
              [GLUSTER_CLI_PROBE]         = { "CLI_PROBE",         GLUSTER_CLI_PROBE,         glusterd_handle_cli_probe,     NULL, 0, DRC_NA},
              [GLUSTER_CLI_CREATE_VOLUME] = { "CLI_CREATE_VOLUME", GLUSTER_CLI_CREATE_VOLUME, glusterd_handle_create_volume, NULL, 0, DRC_NA},
  • Hence the call goes like this: glusterd_handle_create_volume() -> __glusterd_handle_create_volume() (see the paraphrased wrapper after this list).
  • In __glusterd_handle_create_volume() all the required validations are done, e.g. whether a volume with the same name already exists, whether a brick is on a separate partition or on the root partition, and checks on the number of bricks. The gfid* for the volume is also generated here.
  • Another important function is gd_sync_task_begin(). Maybe I will go into the details of this function in a future write-up, because as of now I don’t understand it completely.
  • Once glusterd creates the volume, it sends the data back to the CLI client. This happens in glusterd-rpc-ops.c:glusterd_op_send_cli_response():
    • glusterd_to_cli (req, cli_rsp, NULL, 0, NULL, xdrproc, ctx);
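
The actor registered in gd_svc_cli_actors[] is only a thin wrapper around the function with the leading underscores. Paraphrasing glusterd-volume-ops.c (the exact code can differ between releases), glusterd_big_locked_handler() takes glusterd’s coarse “big lock” and then calls the real handler:

int
glusterd_handle_create_volume (rpcsvc_request_t *req)
{
        /* grab glusterd's big lock, then delegate to the handler that does
         * the actual work */
        return glusterd_big_locked_handler (req,
                                            __glusterd_handle_create_volume);
}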

Network traces captured during the “create volume” command:

::1 -> ::1 Gluster CLI 292 V2 CREATE_VOLUME Call
node1 -> node2 GlusterD Management 200 V2 CLUSTER_LOCK Call
node2 -> node1 GlusterD Management 168 V2 CLUSTER_LOCK Reply (Call In
node1 -> node2 GlusterD Management 540 V2 STAGE_OP Call
node2 -> node1 GlusterD Management 184 V2 STAGE_OP Reply
127.0.0.1 -> 127.0.0.1 GlusterFS Callback 112 [TCP Previous segment
127.0.0.1 -> 127.0.0.1 GlusterFS Handshake 168 V2 GETSPEC Call
127.0.0.1 -> 127.0.0.1 GlusterFS Callback 112 [TCP Previous segment
127.0.0.1 -> 127.0.0.1 GlusterFS Handshake 160 V2 GETSPEC Call
node1 -> node2 GlusterFS Callback 112 V1 FETCHSPEC Call
::1 -> ::1 GlusterFS Callback 132 V1 FETCHSPEC Call
::1 -> ::1 GlusterFS Handshake 192 V2 GETSPEC Call
::1 -> ::1 GlusterFS Callback 132 [TCP Previous segment
::1 -> ::1 GlusterFS Callback 132 [TCP Previous segment
::1 -> ::1 GlusterFS Callback 132 V1 FETCHSPEC Call
node1 -> node2 GlusterD Management 540 V2 COMMIT_OP Call
127.0.0.1 -> 127.0.0.1 GlusterFS Handshake 984 V2 GETSPEC Reply
127.0.0.1 -> 127.0.0.1 GlusterFS Handshake 1140 V2 GETSPEC Reply
::1 -> ::1 GlusterFS Handshake 1436 V2 GETSPEC Reply
::1 -> ::1 GlusterFS Handshake 180 V2 GETSPEC Call
::1 -> ::1 GlusterFS Handshake 1160 V2 GETSPEC Reply
::1 -> ::1 GlusterFS Handshake 188 V2 GETSPEC Call
::1 -> ::1 GlusterFS Handshake 1004 V2 GETSPEC Reply
node2 -> node1 GlusterFS Callback 112 V1 FETCHSPEC Call
node2 -> node1 GlusterD Management 184 V2 COMMIT_OP Reply
node1 -> node2 GlusterD Management 200 V2 CLUSTER_UNLOCK Call
node2 -> node1 GlusterD Management 168 V2 CLUSTER_UNLOCK Reply
::1 -> ::1 Gluster CLI 464 V2 CREATE_VOLUME Reply

The function calls below happened during the volume create, as seen in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log. This was collected after running glusterd in DEBUG mode.

[glusterd-volume-ops.c:69:__glusterd_handle_create_volume]
[glusterd-utils.c:412:glusterd_check_volume_exists]
[glusterd-utils.c:155:glusterd_lock]
[glusterd-utils.c:412:glusterd_check_volume_exists]
[glusterd-utils.c:597:glusterd_brickinfo_new]
[glusterd-utils.c:659:glusterd_brickinfo_new_from_brick]
[glusterd-utils.c:457:glusterd_volinfo_new]
[glusterd-utils.c:541:glusterd_volume_brickinfos_delete]
[store.c:433:gf_store_handle_destroy]
[glusterd-utils.c:571:glusterd_volinfo_delete]
[glusterd-utils.c:597:glusterd_brickinfo_new]
[glusterd-utils.c:659:glusterd_brickinfo_new_from_brick]
[glusterd-utils.c:457:glusterd_volinfo_new]
[glusterd-utils.c:541:glusterd_volume_brickinfos_delete]
[store.c:433:gf_store_handle_destroy]

***************************************************
[glusterd-utils.c:5244:glusterd_hostname_to_uuid]
[glusterd-utils.c:878:glusterd_volume_brickinfo_get]
[glusterd-utils.c:887:glusterd_volume_brickinfo_get]
[glusterd-op-sm.c:4284:glusterd_op_commit_perform]
[glusterd-op-sm.c:3404:glusterd_op_modify_op_ctx]
[glusterd-rpc-ops.c:193:glusterd_op_send_cli_response]
[socket.c:492:__socket_rwv]
[socket.c:2235:socket_event_handler]

gfid*: a UUID maintained by GlusterFS. GlusterFS uses gfids extensively, and a separate write-up would do the topic more justice.
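
Until that write-up, it is enough to know that a gfid is a plain 128-bit UUID; the small standalone example below generates one with libuuid (glusterd uses essentially the same UUID calls through its own wrappers):

/* Standalone example: a gfid/volume-id is just a 128-bit UUID.
 * Build with: gcc gfid-demo.c -o gfid-demo -luuid */
#include <stdio.h>
#include <uuid/uuid.h>

int
main (void)
{
        uuid_t gfid;
        char   gfid_str[37];    /* 36 characters plus the terminating NUL */

        uuid_generate (gfid);
        uuid_unparse (gfid, gfid_str);
        printf ("generated gfid: %s\n", gfid_str);
        return 0;
}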

I hope this gives some insight to anyone trying to understand the GlusterFS CLI framework.
