MySQL Cluster: "Node failure caused abort of transaction" errors


Serhat Rıfat Demircan
Hi,

I have a MySQL Cluster with 4 API nodes, 2 management nodes and 4
data nodes. Today I had problems connecting to the database, and
all queries were hanging in the "Opening tables" state. After inspecting
the logs, I found these errors.

API node:

2015-08-20 19:44:14 15540 [Note] NDB Schema dist: Data node: 5 failed, subscriber bitmask 00
2015-08-20 19:44:14 15540 [Note] NDB Schema dist: Data node: 6 failed, subscriber bitmask 00
2015-08-20 19:44:14 15540 [Note] NDB Schema dist: Data node: 7 failed, subscriber bitmask 00
2015-08-20 19:44:14 15540 [Note] NDB Schema dist: Data node: 8 failed, subscriber bitmask 00
2015-08-20 19:44:14 15540 [Note] NDB Schema dist: cluster failure at epoch 3313124/17.
2015-08-20 19:44:14 15540 [Note] NDB Binlog: ndb tables initially read only on reconnect.
2015-08-20 19:44:14 15540 [ERROR] /opt/mysql/server-5.6/bin/mysqld: Got temporary error 4028 'Node failure caused abort of transaction' from NDBCLUSTER
2015-08-20 19:44:14 15540 [ERROR] /opt/mysql/server-5.6/bin/mysqld: Sort aborted: Got temporary error 4028 'Node failure caused abort of transaction' from NDBCLUSTER
2015-08-20 19:44:14 15540 [ERROR] Got error 4010 when reading table './database_name/table'
2015-08-20 19:44:14 15540 [Note] NDB Binlog: cluster failure for ./database_name/table_name at epoch 3313124/17.
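When an API node log is long, it can help to tally which NDB error codes appear and whether mysqld reported them as temporary. The following is only a minimal sketch: the regular expression and the function name `tally_ndb_errors` are assumptions made for illustration, and the sample lines are copied from the log excerpt above.

```python
import re
from collections import Counter

# Matches mysqld error-log fragments such as:
#   Got temporary error 4028 'Node failure caused abort of transaction' from NDBCLUSTER
#   Got error 4010 when reading table './database_name/table'
NDB_ERROR = re.compile(r"Got (?P<temporary>temporary )?error (?P<code>\d+)")


def tally_ndb_errors(lines):
    """Count (error_code, is_temporary) pairs found in mysqld error-log lines."""
    counts = Counter()
    for line in lines:
        match = NDB_ERROR.search(line)
        if match:
            code = int(match.group("code"))
            is_temporary = match.group("temporary") is not None
            counts[(code, is_temporary)] += 1
    return counts


# Sample lines taken from the API node log in this post.
sample = [
    "2015-08-20 19:44:14 15540 [ERROR] /opt/mysql/server-5.6/bin/mysqld: "
    "Got temporary error 4028 'Node failure caused abort of transaction' from NDBCLUSTER",
    "2015-08-20 19:44:14 15540 [ERROR] /opt/mysql/server-5.6/bin/mysqld: Sort aborted: "
    "Got temporary error 4028 'Node failure caused abort of transaction' from NDBCLUSTER",
    "2015-08-20 19:44:14 15540 [ERROR] Got error 4010 when reading table "
    "'./database_name/table'",
]

for (code, is_temporary), count in sorted(tally_ndb_errors(sample).items()):
    kind = "temporary" if is_temporary else "not marked temporary"
    print(f"error {code} ({kind}): {count}")
```

Errors that mysqld labels as temporary are generally worth retrying from the application once the cluster is reachable again; the tally only shows what was logged, not why.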

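mysql> show processlist;

+----+-------------+------+------+---------+------+---------------------------------+------+
| Id | User        | Host | db   | Command | Time | State                           | Info |
+----+-------------+------+------+---------+------+---------------------------------+------+
|  1 | system user |      | NULL | Daemon  | 1497 | Waiting for ndbcluster to start | NULL |
+----+-------------+------+------+---------+------+---------------------------------+------+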

Data node:

2015-08-20 19:44:14 [ndbd] ERROR -- c_gcp_list.seize() failed: gci: 14229759227592721 nodes: 0000000000000000000000000000040000000000000000000000000000001a00
2015-08-20 19:44:14 [ndbd] WARNING -- ACK wo/ gcp record (gci: 3313124/17) ref: 0fa2000b from: 0fa2000b
2015-08-20 19:44:14 [ndbd] WARNING -- ACK wo/ gcp record (gci: 3313124/17) ref: 0fa2000c from: 0fa2000c
2015-08-20 19:44:14 [ndbd] WARNING -- ACK wo/ gcp record (gci: 3313124/17) ref: 0fa2000a from: 0fa2000a
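The large "gci" value in the ERROR line and the "3313124/17" form in the WARNING lines appear to be the same epoch written two ways. Assuming the usual NDB convention that an epoch is a 64-bit number with the GCI in the high 32 bits and the micro-GCI in the low 32 bits, the conversion can be sketched as follows (the function names are made up for this example):

```python
def pack_epoch(gci, micro_gci):
    """Combine the two 32-bit halves into a single 64-bit epoch number."""
    return (gci << 32) | micro_gci


def unpack_epoch(epoch):
    """Split a 64-bit epoch number into (gci, micro_gci)."""
    return epoch >> 32, epoch & 0xFFFFFFFF


# Value from the "c_gcp_list.seize() failed" line in the data node log.
gci, micro_gci = unpack_epoch(14229759227592721)
print(f"{gci}/{micro_gci}")
```

Under that assumption the value decodes to 3313124/17, the same epoch at which the API node reported the cluster failure.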

After restarting the API node, the database started working normally again.
What might cause the "Node failure caused abort of transaction" error?
Re: Mysql Cluster Node failure caused abort of transaction Errors

Hartmut Holzgraefe
Hi,

Could you collect all node log files (mgmt, data, API) and config.ini and upload them to some public service?

Preferably use the ndb_error_reporter tool, which automates log collection and wraps everything up in a nice tarball.
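For anyone scripting the collection step, here is a rough sketch of how the ndb_error_reporter command line could be assembled. It assumes the documented usage `ndb_error_reporter path/to/config-file [username] [options]` and the `--fs` option for including data node file system data; the config path and SSH user below are placeholders, not values from this thread.

```python
import shlex


def build_reporter_command(config_path, ssh_user=None, include_fs=False):
    """Return the argv list for an ndb_error_reporter run."""
    argv = ["ndb_error_reporter", config_path]
    if ssh_user:
        # Optional user name the tool uses when copying files from other hosts.
        argv.append(ssh_user)
    if include_fs:
        # Also collect data node file system data (can make the archive large).
        argv.append("--fs")
    return argv


# Placeholder path and user; adjust to the actual management host setup.
command = build_reporter_command("/var/lib/mysql-cluster/config.ini", ssh_user="mysql")
print(shlex.join(command))
```

The printed command would be run on a management node host that can reach the other cluster hosts over SSH; passing the list to `subprocess.run(command, check=True)` is one way to execute it from a script.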
--
Hartmut Holzgraefe, Principal Support Engineer (EMEA)
MariaDB Corporation | http://www.mariadb.com/

--
MySQL Cluster Mailing List
For list archives: http://lists.mysql.com/cluster
To unsubscribe:    http://lists.mysql.com/cluster

Re: Mysql Cluster Node failure caused abort of transaction Errors

Serhat Rıfat Demircan
Hi,

Data node config: https://gist.github.com/sdemircan/730fa49fcc14b4376c42
API node config: https://gist.github.com/sdemircan/f9d230d32700b86564fd
Management node config: https://gist.github.com/sdemircan/d6fbd54799daaae01bf2

API node logs: https://gist.github.com/sdemircan/2d62b1c92176de9de9d3

Data node (192.168.141.162) logs: https://gist.github.com/sdemircan/d0c97b82457a9c33deaa
Data node (192.168.141.187) logs: https://gist.github.com/sdemircan/3faa1e41367bc7655210

Management node logs: https://gist.github.com/sdemircan/a026ac57757fafdafaa9

From the management node log:
2015-08-20 19:44:14 [MgmtSrvr] INFO     -- Node 5: Disconnecting lagging nodes '0000000000000000000000000000000000000000000000000000000000000200',
2015-08-20 19:44:14 [MgmtSrvr] WARNING  -- Node 5: Disconnecting node 9 because it has exceeded MaxBufferedEpochs (100 > 100), epoch 3313119/4

It seems that exceeding MaxBufferedEpochs was the real cause?


Serhat
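To put the MaxBufferedEpochs warning in perspective, the sketch below pulls the numbers out of the management log line and estimates how far behind the disconnected subscriber was. It assumes TimeBetweenEpochs is at its default of 100 ms (the actual value lives in the linked config and is not reproduced here) and that the lagging-nodes bitmask is a plain hexadecimal number with bit N set for node N. The helper names are invented for this example.

```python
import re

LAG_WARNING = re.compile(
    r"Node (?P<reporter>\d+): Disconnecting node (?P<lagging>\d+) because it has "
    r"exceeded MaxBufferedEpochs \((?P<buffered>\d+) > (?P<limit>\d+)\), "
    r"epoch (?P<gci>\d+)/(?P<micro_gci>\d+)"
)


def parse_lag_warning(line, time_between_epochs_ms=100):
    """Extract node ids and epoch counts; estimate the lag window in seconds."""
    match = LAG_WARNING.search(line)
    if match is None:
        return None
    info = {name: int(value) for name, value in match.groupdict().items()}
    # Assumption: each buffered epoch spans roughly TimeBetweenEpochs milliseconds.
    info["approx_lag_seconds"] = info["limit"] * time_between_epochs_ms / 1000
    return info


def nodes_in_mask(hex_mask):
    """List the node ids whose bits are set in a hexadecimal node bitmask."""
    value = int(hex_mask, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


warning_line = (
    "2015-08-20 19:44:14 [MgmtSrvr] WARNING  -- Node 5: Disconnecting node 9 "
    "because it has exceeded MaxBufferedEpochs (100 > 100), epoch 3313119/4"
)
mask = "0000000000000000000000000000000000000000000000000000000000000200"

info = parse_lag_warning(warning_line)
print(info["lagging"], info["approx_lag_seconds"], nodes_in_mask(mask))
```

With those assumptions, node 9 (which matches the bit set in the mask) had fallen about 10 seconds of epochs behind before data node 5 disconnected it. Note that the disconnect happened at epoch 3313119/4, a few epochs before the 3313124/17 failure reported in the API node log.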

Re: Mysql Cluster Node failure caused abort of transaction Errors

Serhat Rıfat Demircan
Maybe some other problem pushed the number of buffered epochs up to the MaxBufferedEpochs limit?
