Bug 1004546 - peer probe can deadlock in "Sent and Received peer request" for both servers after server build
Summary: peer probe can deadlock in "Sent and Received peer request" for both servers ...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kaushal
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-09-04 21:53 UTC by Todd Stansell
Modified: 2017-08-15 14:41 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-15 14:41:27 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
admin01 logs from failure after kickstart (74.12 KB, text/plain) - 2013-09-04 21:54 UTC, Todd Stansell
admin02 logs from failure after kickstart (130.12 KB, text/plain) - 2013-09-04 21:54 UTC, Todd Stansell
admin01 logs from after removing peer state and restarting glusterd (56.68 KB, text/plain) - 2013-09-04 21:55 UTC, Todd Stansell
admin02 logs from after removing peer state and restarting glusterd (135.67 KB, text/plain) - 2013-09-04 21:56 UTC, Todd Stansell
admin01 logs from success after kickstart (48.21 KB, text/plain) - 2013-09-04 21:56 UTC, Todd Stansell
admin02 logs from success after kickstart (126.49 KB, text/plain) - 2013-09-04 21:56 UTC, Todd Stansell

Description Todd Stansell 2013-09-04 21:53:13 UTC
Description of problem:

Occasionally after rebuilding a node in a replica 2 cluster, the initial peer probe from the rebuilt node leaves both peers stuck in the "Sent and Received peer request" state; they never exchange volume information and the rebuilt node never moves to the "Peer in Cluster" state.

The only way out of this I've found is to stop glusterd on both nodes, remove the state= parameter from the /var/lib/glusterd/peers/<uuid> file, and then start glusterd again.  After starting glusterd, the negotiation between the two peers starts over from the "Establishing Connection" state and things work as expected.
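
For reference, a rough sketch of that workaround as shell commands (my wording of the steps above, not a tested script; it assumes the peer files live under /var/lib/glusterd/peers/ and is run on both nodes):

  # stop glusterd, drop the cached peer handshake state, then start glusterd again
  service glusterd stop
  # remove only the state= line from the peer file(s); glusterd renegotiates from
  # "Establishing Connection" on the next start
  sed -i '/^state=/d' /var/lib/glusterd/peers/*
  service glusterd start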

Version-Release number of selected component (if applicable):
3.4.0

How reproducible:
It seems to happen every time I change which host is being rebuilt, but not if I rebuild the same node.  I'm not 100% sure of this pattern, but it appears to hold.

Steps to Reproduce:
1. begin with a replica 2 cluster
2. shut down services and kickstart one server
3. restore previous uuid in /var/lib/glusterd/glusterd.info
4. start glusterd
5. run: gluster peer probe $peer
6. restart glusterd (see the shell sketch after this list)
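
A rough shell sketch of steps 3-6 as run on the rebuilt node (hostnames match the script further below; <previous-uuid> is a placeholder, not a value from this report):

  # step 3: restore the node's previous UUID (assumes glusterd.info holds a UUID= line)
  echo "UUID=<previous-uuid>" > /var/lib/glusterd/glusterd.info
  # step 4: start glusterd
  service glusterd start
  # step 5: probe the surviving peer from the rebuilt node
  gluster peer probe admin01.mgmt
  # step 6: restart glusterd
  service glusterd restart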

Actual results:

peer status shows both peers in "Sent and Received peer request", with each apparently waiting for an ACC from the other side.

Expected results:

peer should end up in "Peer in Cluster" state, with volume information exchanged and bricks started.

Additional info:

In our situation, we've written kickstart scripts to automate the peer probe and rejoining of the cluster.  During kickstart we preserve the uuid of the server (step 3) and then set up an init script to run soon after glusterd starts on first boot.  The script generated in our test while rebuilding admin02.mgmt is as follows (so you can see our exact steps):

#!/bin/bash
# initialize glusterfs config
#

# Source function library.
. /etc/init.d/functions

me=admin02.mgmt
peer=admin01.mgmt
gluster peer probe $peer
for i in 1 2 3 4 5; do
    echo -n "Checking for Peer in Cluster .. $i .. "
    out=`gluster peer status 2>/dev/null | grep State:`
    echo $out
    if echo $out | grep "Peer in Cluster" >/dev/null; then
        break
    fi
    sleep 1
done
# restart glusterd after we've attempted a probe
service glusterd restart

for i in 1 2 3 4 5; do
    echo "Checking for volume info .. $i"
    out=`gluster volume info 2>/dev/null | grep -v "^No "`
    if [ -n "$out" ] ; then
        break
    fi
    sleep 1
done
#----------------------------------------------------

One of the failures we observed showed the following on the console:

  Running /etc/rc3.d/S21glusterfs-init start
  peer probe: success
  Checking for Peer in Cluster .. 1 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 2 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 3 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 4 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 5 .. State: Accepted peer request (Connected)
  Stopping glusterd:[  OK  ]
  Starting glusterd:[  OK  ]
  Checking for volume info .. 1
  Checking for volume info .. 2
  Checking for volume info .. 3
  Checking for volume info .. 4
  Checking for volume info .. 5

After this, if we look at peer status, it shows both nodes in the "Sent and Received peer request" status.

When this procedure works, we get output like the following:

  Running /etc/rc3.d/S21glusterfs-init start
  peer probe: success
  Checking for Peer in Cluster .. 1 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 2 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 3 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 4 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 5 .. State: Accepted peer request (Connected)
  Stopping glusterd:[  OK  ]
  Starting glusterd:[  OK  ]
  Checking for volume info .. 1
  Checking for volume info .. 2

And at this point, it joins the cluster and starts the bricks.

I will attach the etc-glusterfs-glusterd logs (in DEBUG mode) from both servers for 3 different situations to help show what's going on.

  * The logs with -194603 suffix are from the failed attempt above to kickstart admin02.
  * The logs with -200414 are after I shut down glusterd on both nodes and removed state= from the peer files, causing them to start over and join the cluster.
  * The logs with -215614 are a second full kickstart of admin02 where it succeeded as expected.

The only pattern I can find is that when I switch which node is being kickstarted, it seems to fail every time.  If I keep kickstarting the same node, it continues to succeed.

Todd

Comment 1 Todd Stansell 2013-09-04 21:54:21 UTC
Created attachment 793876 [details]
admin01 logs from failure after kickstart

Comment 2 Todd Stansell 2013-09-04 21:54:55 UTC
Created attachment 793877 [details]
admin02 logs from failure after kickstart

Comment 3 Todd Stansell 2013-09-04 21:55:37 UTC
Created attachment 793878 [details]
admin01 logs from after removing peer state and restarting glusterd

Comment 4 Todd Stansell 2013-09-04 21:56:00 UTC
Created attachment 793879 [details]
admin02 logs from after removing peer state and restarting glusterd

Comment 5 Todd Stansell 2013-09-04 21:56:27 UTC
Created attachment 793880 [details]
admin01 logs from success after kickstart

Comment 6 Todd Stansell 2013-09-04 21:56:52 UTC
Created attachment 793881 [details]
admin02 logs from success after kickstart

Comment 7 Banio Carpenter 2014-10-10 20:17:07 UTC
I can confirm that this is still not fixed, although to make it work I had to change the line in the /var/lib/glusterd/peers/<uuid> file from:
state=5
to:
state=3
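
For anyone else hitting this, a minimal sketch of that edit (assuming the peer files live under /var/lib/glusterd/peers/ and glusterd is stopped on both nodes first; not a tested recipe):

  service glusterd stop
  # flip the stored handshake state from 5 back to 3 in each peer file
  sed -i 's/^state=5$/state=3/' /var/lib/glusterd/peers/*
  service glusterd start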

gluster version: 3.5.2

OS: CentOS

Please let me know if any further information is needed.

Comment 8 Atin Mukherjee 2017-01-30 06:11:29 UTC
(In reply to Todd Stansell from comment #0)
> Steps to Reproduce:
> 1. begin with a replica 2 cluster
> 2. shut down services and kickstart one server
> 3. restore previous uuid in /var/lib/glusterd/glusterd.info
Why are we trying to restore the previous UUID? If it's a fresh setup, then you should retain the original UUIDs.


Comment 9 Atin Mukherjee 2017-08-08 15:43:55 UTC
Bump, can the needinfo be addressed?

Comment 10 Todd Stansell 2017-08-08 23:20:17 UTC
I can't provide anything more. I left that company years ago.

