Creating Custom Distributed Storage Engine

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Creating Custom Distributed Storage Engine

Jonathan Ellithorpe
Hi All,

I saw the documentation here:

https://dev.mysql.com/doc/internals/en/custom-engine.html

But I wasn't sure if this documentation was valid for the case where I want
to create a distributed, shared, storage backend for MySQL, where many
MySQL servers are all accessing the same storage backend.

Could someone please point me in the right direction? I'm not sure if
there's something special that goes into creating a distributed storage
engine plugin for MySQL (say, for Cassandra, where one intends to deploy
multiple MySQL servers that all access the same Cassandra cluster).

Thanks for any help,
Jonathan
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Creating Custom Distributed Storage Engine

Mauritz Sundell
Hi Jonathan

On 2016-07-19 02:05, Jonathan Ellithorpe wrote:
> Hi All,
>
> I saw the documentation here:
>
> https://dev.mysql.com/doc/internals/en/custom-engine.html
>
> But I wasn't sure if this documentation was valid for the case where I want
> to create a distributed, shared, storage backend for MySQL, where many
> MySQL servers are all accessing the same storage backend.
The ndbcluster engine, which connects to a distributed storage, uses the
same storage engine interface as any storage engine.
So the referenced documentation should be valid, although there will be
complications.

>
> Could someone please point me in the right direction? I'm not sure if
> there's something special that goes into creating a distributed storage
> engine plugin for MySQL (say, for Cassandra, where one intends to deploy
> multiple MySQL servers that all access the same Cassandra cluster).
The storage engine interface have no mean to pass knowledge of other
MySQL servers connected to the same distributed data.
And the MySQL server do not need to know either.

Starting with a read only engine, the first complication is how each
MySQL server should know about tables accessible in the distributed storage.
For this one could let CREATE TABLE mean create a local table object on
the local server and map that do a table in the distributed storage,
exactly how is up to the engine implmentation.
DROP TABLE should then mean drop the mapping, not dropping the table in
the distributed storage.

If one want to be able to create and drop tables from MySQL in the
distributed storage it is probably wise to store the SQL-statement in
the distributed storage and ensure that other MySQL server uses the same
create statement for the table.  One might also implement some way for
handler to notify each other about tables created or dropped, and create
them from engine when notified, see ha_create_table_from_engine().

Supporting writes should not be a problem either, but some feature need
some extra thoughts either to application or storage engine implemention.

For example triggers on a distributed table will only be created locally
on the MySQL server the create trigger statement are issued on, and it
will not be passed to storage engine so the engine can not distribute
the create trigger.  This implies that application need to create the
same triggers on all MySQL servers if needed. Also triggers are executed
only on the MySQL issuing the triggering SQL-statement.  And updates not
going through a MySQL server will not trigger a MySQL trigger.

If you intend to use MySQL replication the storage engine must have some
way to subscribe on all relevant row changes from the distributed
storage and then inject them into the servers binlog, otherwise only
SQL-statement issued locally on the binlogging MySQL server will be
logged and replicated.  I believe this is not documented except in the code.

Skipping MySQL replication and distributing DDL writing a storage engine
for a distributed storage should be as writing one for a local storage.

Regards
Mauritz Sundell

>
> Thanks for any help,
> Jonathan
>


--
MySQL Cluster Mailing List
For list archives: http://lists.mysql.com/cluster
To unsubscribe:    http://lists.mysql.com/cluster

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Creating Custom Distributed Storage Engine

Jonathan Ellithorpe
In reply to this post by Jonathan Ellithorpe
Thanks for the information Mark, I haven't looked into Postgres FDW, not
familiar with SQL/MED.

Indeed the MySQL custom storage engine documentation is lacking. It doesn't
even talk about how to create a new index.

One concern I would have is if the MySQL top-half did something like
caching, instead of relying on the SE for that. If this were the case, then
MySQL wouldn't work in a distributed environment because writes by other
MySQL instances wouldn't get reflected in the cached copies of that data on
other sites. However, I can't find anywhere that says that it doesn't do
that.

MySQL Cluster does the job of being distributed, but it's not clear to me
if the MySQL Cluster top-half is actually the same as the normal MySQL
top-half with NDB beneath it... or if MySQL Cluster top-half is different
in order to support multiple sites... Anyone know about that?

Jonathan

On Wed, Jul 20, 2016 at 5:40 AM MARK CALLAGHAN <[hidden email]> wrote:

> There isn't much public discussion about engine internals in MySQL-land. I
> wish that weren't so. If that is a problem for you, have you considered
> Postgres FDW? There is also an engine API for Tarantool (tarantool.org)
> and given the low-latency performance in Tarantool that might be a good
> match for you.
>
> I wonder whether the documentation you reference has been used by anyone.
> I also wonder whether it was useful. I am not aware of people using it and
> I have worked on 2 custom storage engines. The handler API isn't easy so
> you will have to find a mailing list where your questions get answered. The
> MariaDB lists are active, the MySQL lists are not.
>
> If your engine implements the RocksDB API then you can start with the
> MyRocks engine.
>
> https://github.com/facebook/mysql-5.6/tree/webscalesql-5.6.27.75/storage/rocksdb
>
> There is integration with Cassandra in MariaDB. I have no experience with
> it:
> https://www.google.com/?ion=1&espv=2#q=mariadb%20cassandra
>
> For performance, you might want to push as much work as possible from the
> mysqld process to your distributed database. I don't know whether you can
> push down partial aggregation. You can push non-indexed predicates via
> engine condition pushdown and index condition pushdown might also be
> interesting:
>
> https://dev.mysql.com/doc/refman/5.7/en/condition-pushdown-optimization.html
>
> https://dev.mysql.com/doc/refman/5.7/en/index-condition-pushdown-optimization.html
>
> AFAIK, the MySQL top-half (above your engine) doesn't have to do much with
> row-locks so your engine doesn't have to change for that. But with DDL
> (add/drop table/index) there is locking done in the MySQL top-half. I am
> not sure how NDB/Cluster coordinates that between mysqld processes.
>
>
> On Mon, Jul 18, 2016 at 5:05 PM, Jonathan Ellithorpe <[hidden email]>
> wrote:
>
>> Hi All,
>>
>> I saw the documentation here:
>>
>> https://dev.mysql.com/doc/internals/en/custom-engine.html
>>
>> But I wasn't sure if this documentation was valid for the case where I
>> want
>> to create a distributed, shared, storage backend for MySQL, where many
>> MySQL servers are all accessing the same storage backend.
>>
>> Could someone please point me in the right direction? I'm not sure if
>> there's something special that goes into creating a distributed storage
>> engine plugin for MySQL (say, for Cassandra, where one intends to deploy
>> multiple MySQL servers that all access the same Cassandra cluster).
>>
>> Thanks for any help,
>> Jonathan
>>
>
>
>
> --
> Mark Callaghan
> [hidden email]
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Creating Custom Distributed Storage Engine

Mauritz Sundell
Hi Jonathan

On 2016-07-20 21:40, Jonathan Ellithorpe wrote:
> Thanks for the information Mark, I haven't looked into Postgres FDW, not
> familiar with SQL/MED.
IIUC SQL/MED version of FDW to not support remote DDL nor DML (even if
it include a notion of updatable remote tables), but Postgres version I
believe do support remote DML at least.

>
> Indeed the MySQL custom storage engine documentation is lacking. It doesn't
> even talk about how to create a new index.
Yes, I noted that too.
Not only are new things missing, but really old stuff like creating indexes.

Sadly I'm not versed in storage engine and the handler interface.
But IIUC indexes are passed in create_info to ha_create() and alter_info
to ha_inplace_alter_table().
ha_create are used for create table and copying alter table.
create index likely turn up to an inplace_alter_table.

I can currently do not give you better advise than check out the source :(
Some comments in sql/handler.h/cc and of course look in the source for
other storage engines.

And even if it might not help you now, file a documentation bug.
>
> One concern I would have is if the MySQL top-half did something like
> caching, instead of relying on the SE for that. If this were the case, then
There should not be caching in the "top-half", unless query cache is in
use.
> MySQL wouldn't work in a distributed environment because writes by other
> MySQL instances wouldn't get reflected in the cached copies of that data on
> other sites. However, I can't find anywhere that says that it doesn't do
> that.
ndbcluster also depends on this, so no caching.
But the question how to syncronize reads and writes will be up to the
underlying distributed database and storage engine logic.

>
> MySQL Cluster does the job of being distributed, but it's not clear to me
> if the MySQL Cluster top-half is actually the same as the normal MySQL
> top-half with NDB beneath it... or if MySQL Cluster top-half is different
> in order to support multiple sites... Anyone know about that?
Since ~5.0 MySQL Cluster forked MySQL and diverged, but the current
state is that servers from MySQL Cluster 7.5 and MySQL 5.7 have the same
code, with some minor exceptions.
Servers from older release series will differ more.
Most probably you can use ndbcluster engine from 7.5 also with a server
from 5.7.

Mauritz

>
> Jonathan
>
> On Wed, Jul 20, 2016 at 5:40 AM MARK CALLAGHAN <[hidden email]> wrote:
>
>> There isn't much public discussion about engine internals in MySQL-land. I
>> wish that weren't so. If that is a problem for you, have you considered
>> Postgres FDW? There is also an engine API for Tarantool (tarantool.org)
>> and given the low-latency performance in Tarantool that might be a good
>> match for you.
>>
>> I wonder whether the documentation you reference has been used by anyone.
>> I also wonder whether it was useful. I am not aware of people using it and
>> I have worked on 2 custom storage engines. The handler API isn't easy so
>> you will have to find a mailing list where your questions get answered. The
>> MariaDB lists are active, the MySQL lists are not.
>>
>> If your engine implements the RocksDB API then you can start with the
>> MyRocks engine.
>>
>> https://github.com/facebook/mysql-5.6/tree/webscalesql-5.6.27.75/storage/rocksdb
>>
>> There is integration with Cassandra in MariaDB. I have no experience with
>> it:
>> https://www.google.com/?ion=1&espv=2#q=mariadb%20cassandra
>>
>> For performance, you might want to push as much work as possible from the
>> mysqld process to your distributed database. I don't know whether you can
>> push down partial aggregation. You can push non-indexed predicates via
>> engine condition pushdown and index condition pushdown might also be
>> interesting:
>>
>> https://dev.mysql.com/doc/refman/5.7/en/condition-pushdown-optimization.html
>>
>> https://dev.mysql.com/doc/refman/5.7/en/index-condition-pushdown-optimization.html
>>
>> AFAIK, the MySQL top-half (above your engine) doesn't have to do much with
>> row-locks so your engine doesn't have to change for that. But with DDL
>> (add/drop table/index) there is locking done in the MySQL top-half. I am
>> not sure how NDB/Cluster coordinates that between mysqld processes.
>>
>>
>> On Mon, Jul 18, 2016 at 5:05 PM, Jonathan Ellithorpe <[hidden email]>
>> wrote:
>>
>>> Hi All,
>>>
>>> I saw the documentation here:
>>>
>>> https://dev.mysql.com/doc/internals/en/custom-engine.html
>>>
>>> But I wasn't sure if this documentation was valid for the case where I
>>> want
>>> to create a distributed, shared, storage backend for MySQL, where many
>>> MySQL servers are all accessing the same storage backend.
>>>
>>> Could someone please point me in the right direction? I'm not sure if
>>> there's something special that goes into creating a distributed storage
>>> engine plugin for MySQL (say, for Cassandra, where one intends to deploy
>>> multiple MySQL servers that all access the same Cassandra cluster).
>>>
>>> Thanks for any help,
>>> Jonathan
>>>
>>
>>
>> --
>> Mark Callaghan
>> [hidden email]
>>


--
MySQL Cluster Mailing List
For list archives: http://lists.mysql.com/cluster
To unsubscribe:    http://lists.mysql.com/cluster

Loading...