Scimore Company Blog
Ramblings on databases

Distributed database: Installation

February 4, 2010 14:05 by scimore

Hi,

My name is Marius Slyzius and I will be posting a series of the blobs about distributed database:

1. Installing cluster.
2. Create schema. Replicated/Partitioned tables. Execute distributed queries. DQL query plans.
3. Designing "true" shared-nothing schema. Pros/Cons
4. Setup failover/failback

Today is about how to install the cluster.

ScimoreDB distributed is a database where data is partitioned or replicated among X nodes. The node is a database server what was joined to a cluster by SQL command "ALTER CLUSTER". The number of nodes can be 2^n-1 = {3,7,15,31,63,...}. This number of nodes needed to form the binary tree, where a  node can send requests to 2 its childs and the childs to 2 their childs. For example we have cluster of 7 nodes and the client sends the query to node 4. In such case, the node 4 will send the query to childs 3,5 , and, each child will send to its childs 3->1,2 and 5 ->6,7. Everything happens in parallel. It is  similar to map/reduce, except that the node 4 don't need to send the request to all nodes, only to 2 its childs. The advantage of what is that we can do  reduce in parallel as well. For example, "select count(*) from mytable" sent by client to the node 4. Node 4 will start counting rows and sends SQL to 3 and 5 nodes.  Node 3 will start counting rows and pass to 1 and 2 nodes to count. Nodes 1 and 2 will count rows , but don't send sql anymore, since there are no childs.  The same will happen with node 5. Now, when node finished counting rows, the aggregation phase starts. In parallel, node 3 and 5 will receive the  counted rows from their childs and sum it with local row count. Each node did aggregation of local rows in parallel, and aggregation of aggregated data from  (1,2,6,7) on 3 and 5 in parallel too. Finally 4 will aggregate its counted rows with the aggregations from 3 and 5 nodes.

ScimoreDB uses DQL language to define how distributed query is executed. SQL is always converted to DQL. Before executing DQL on distributed DB it 
will be optimized for distributed query, for example, select rows only from a single node or all, add merge aggregators, stream splitters, etc...
To see DQL for SQL, press show query plan button in query window in the manager.

So, how to create distributed DB? First, you need to install db servers. I have created 4 batch files (to install 3 servers, uninstall 3 servers , start 3,and stop 3). All DB servers will be managed on a single PC, however, it could be 3 different machines preferably. The batch files located in "c:\db" folder together with scimoredb.exe (install ScimoreDB version 3 on machine and copy exe from there).

Install.cmd:
mkdir node1
mkdir node2
mkdir node3
scimoredb.exe -startup/instance=0 -startup/mode=1 -startup/console=1 -net/endpoint="localhost:999"    -db/cachepages=5000 -db/systables="c:\db\node1" -db/syslog="c:\db\node1"  -db/data="c:\db\node1"
scimoredb.exe -startup/instance=1 -startup/mode=1 -startup/console=1 -net/endpoint="localhost:1000"  -db/cachepages=5000 -db/systables="c:\db\node2" -db/syslog="c:\db\node2"  -db/data="c:\db\node2"
scimoredb.exe -startup/instance=2 -startup/mode=1 -startup/console=1 -net/endpoint="localhost:1001"  -db/cachepages=5000 -db/systables="c:\db\node3" -db/syslog="c:\db\node3"  -db/data="c:\db\node3"

Parameters:
-startup/instance={0,1,2} - instance id is the registry key to separate startup parameters of each DB on the same pc. HKEY_LOCALMACHINE\Software\Scimore\Nodes\[0] ... [2]. Also, instance id is added to the end of DB service name. {scimoredb-0,scimoredb-1,scimoredb-2}. Note, instance id's has nothing to do with the nodes in the cluster. If you install databases on distinct pc's set instance=0 for all them.

-net/endpoint="localhost:999-1001" - for single DB server, used only port the database listens. "localhost" has no meaning yet. Database listens on all IP addresses. However, when you join databases to the cluster, each database will become node and connect to each using  net/endpoint host:port value. In my case all 3 runs on the same machine, so it’s fine. But, if nodes are on different machine you need to specify correct either IP:port or host:port. Use "host:port" if IP might changes. Using "IP:port" you can specify different IP for comm between nodes and client/node (f.x. infinity band over socket network) for faster communication between nodes than clients to node(s). If the node can't be accessed anymore you need to update [endpoint] value for this node on all nodes. The information is stored in "system.sysinstances" table(nothing to do with -startup/instance). Again, if at least one node inaccessible you will not be able to connect and update the table with new values. To resolve it, start each node with the following parameters (first kill/stop node) and from cmd:
>scimoredb.exe -startup/instance=0 -startup/console=1 -net/mode=1 -net/endpoint="localhost:999"
When node starts it will ignore cluster setup because of net/mode=1 parameter and start single server listening on 999 port. Then, connect with the manager and update table sysinstances in system catalog with correct IP/Host/Port values: update system.systables set [endpoint] = "newIP:port" where instanceid=x.  Here, " instanceid" is not instance we used to install, but it is the node ID. Repeat it for each node. When done, restart services and you should be able to connect again.

-db/cachepages=5000 - number of the data pages to cache. Each page is 8kb, more you add better it is. The value can be changed in the registry, followed by service restart.

Uninstall.cmd:
scimoredb.exe -startup/instance=0 -startup/mode=2 -startup/console=1
scimoredb.exe -startup/instance=1 -startup/mode=2 -startup/console=1
scimoredb.exe -startup/instance=2 -startup/mode=2 -startup/console=1

-startup/mode=2 - tells to uninstall and delete data files. Startup/mode=20 will uninstall, but leave data files. Note: systables will contain cluster info and later if you install on top new instance without data overwrite, you won't be able to connect to db, because it will "think" it is in the cluster and try to access to other nodes listed in sysinstances table.

Start.cmd: start 3 databases on local pc.
net start scimoredb-0
net start scimoredb-1
net start scimoredb-2

Stop.cmd:
net stop scimoredb-0
net stop scimoredb-1
net stop scimoredb-2

Lets make a cluster now:
1. Stop scimoredb-0 service. Run uninstall.cmd. Check if scimoredb.exe process is not running, if it does kill it and run script again. Because, I have installed scimoredb with MSI and it did installed scimoredb-0 (0-instance), so I uninstall it. I could just create new instances {1,2,and 3} and leave 0 running.
2. Run install.cmd. The script will create 3 databases and each database will place system tables under Node1/2/3 folder. Then install finished, verify if NodeX contains files and if services scimoredb-{0,1,2} created.
3. Run start.cmd. Check if 3 scimoredb.exe processes running. Start manager and verify the connection to each database.
4. With manager connect to localhost/ port 999(database 0 on my pc) and in query window execute SQL command:
alter cluster
(
add endpoint 'localhost:1000',
add endpoint 'localhost:1001'
)
NOTE: make sure endpoint 'localhost:1000' matches endpoint value used when installed databases (-net/endpoint="localhost:999...).

Restart the manager. Connect to any database. Under "Cluster" tree node in the manager you will see 3 servers for each database/node you connected. Your 3 databases now in a cluster and you are ready to create databases/tables.

If you have trouble to create cluster, ideas/suggestion please don't hesitate to contact me:

m a r i u s     a t     s c i m o r e     d o t     c o m

 


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Cannot connect with .NET provider

October 12, 2009 15:13 by scimore

We have had a few support mails regarding common issue when connecting to the server.

In order to connect, use "Server=myserver/ip..." instead of "Data Source=...." in the connection string. The "Data Source" is used to specify the embedded database path.  Here is a link for connection strings in scimoredb: http://www.scimore.com/howto/getstarted/connection-strings


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags:
Categories:
Actions: E-mail | Permalink | Comments (0) | Comment RSSRSS comment feed

Locking, concurrency and transaction isolation levels in v3.0

September 7, 2009 18:04 by scimore

As the new version approaches the final cut, we would like to highlight some of the features that were not documented yet:

Locking and multi version concurrency control (MVCC)
ScimoreDB uses locking to prevent readers to read uncommitted rows. By default, readers use MVCC and data modifications row level exclusive locks. Readers will only read last committed rows' version, if concurrent transaction modified row and not committed it yet.
Here can be potential issue with concurrent updates when index field(s) being updated (the row will be re-inserted). For example, let’s say reader transaction selects fields from the table based on indexed field field1, where field1 > 100. The query will look up the lowest row where field1 > 100 and continue reading all rows to the end of the index. At the same time, assume the update updates a single row where field1 = 500 to 50000. The database will mark the current row as deleted and insert new row much further in the index. So, when reader reach deleted row, it will read it, since modified transaction has not yet committed and when reach row with field1 = 50000, skip it, because transaction still not committed. Now, consider the transaction that modified row has committed right after reader has read field1=500. When reader will reach field1=50000, the row will be read too, since update transaction has been committed. In such case, the reader has read the same row twice: the old version field1=500 and the new field1=50000. Hence, this is not happens when update in place occurs (updating field that is not part of the index). To avoid such case, reader may add additional locking hints:
select * from mytable xlock row - turn off MVCC and use exclusive row level locks, like with update/delete/insert.
select * from mytable read consistent - when reader reach modified/new row that was not committed yet, it will wait for the transaction to commit.
Another solution is to use READ REPEATABLE transaction isolation level.


Transaction isolation levels.
The syntax either use:
begin tran <isolevel> ... commit/rollback

or

set transaction isolation level <isoLevel>; SQL command;

ScimoreDB supports 3 isolation levels (<isoLevel>):

  1. READ COMMITTED - this is default isolation level. Readers will read only committed rows. Using locking hints you can optimize how to deal with modified/new not committed row as described above. The update/insert/delete will always use row-level locking and if row is modified by concurrent transaction will wait for commit.
  2. READ REPEATABLE - the isolation level is similar to SNAPSHOT isolation level in SQL Server. When read repeatable transaction starts it will see all rows at the point it started and any further modifications will not be visible for the particular transaction. For example:
    1st transaction updated row from value 0 to 100, but not committed yet. When starts second READ REPEATABLE transaction, at this point the rows' value will be 0 if it will read, since the 1st transaction has not committed when 2nd started. 3rd transaction starts, modify another row, set value 1 to 101 and commit. Still, the 2nd transaction will not "see" the change. However, other new started transactions will see the change made by the 3rd transaction. If DB crash at this point, recovery will apply 3rd update and abort 1st transaction update, since it was not committed during crash.
    READ REPEATABLE uses row versioning to store old versions of modified rows located in "sysundo.dat" file. Therefore, “sysundo.dat” file may grow in size until commit.
    Using SET TRANSACTION ISOLATION LEVEL REPEATABLE READ will commit current transaction when the reader read all rows. For example, SQL batch:
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
    select * from mytable  - until the client read all rows, the read repeatable transaction will not commit and maintain row versioning in "sysundo.dat".
  3. SERIALIZABLE - only 1 transaction at the time.

 

How to shring sysundo.dat & sysredo.log.
Sysredo.dat file is the transaction log. It always grows until checkpoint. By default, the checkpoint occurs every 5 minutes (it can be changed in the configuration). So, for example, single transaction inserting/updating/deleting many rows may noticeably increase sysredo.dat file size. Even, if after checkpoint the file space will be re-used, it might be ok for servers, but unacceptable for the embedded database. In such case you may execute: TRUNCATE TRANSACTION LOG sql command to truncate “sysundo.dat” and “sysredo.log” files. Note, the truncate transaction is serializable, i.e. no other transaction will run until it done and, also, the truncate will wait for all pending transaction to commit without allowing new once to start.

Disk space savings in the new engine.
The new engine uses much less disk space, usually 4-8 times less to compare with v2.5. This has been achieved by compressing of internal rows' attributes and using new b+tree splitting algorithm that does not pre-allocate 50% (8kb) disk space on page splitting.

Why *.dat files size never change in file explorer?
ScimoreDB opens files in exlusive mode, without Windows file manager cache. Therefore, the file size change is invisible in explorer. To see the current size of the table you can use manager, right click on the table, choose alter table... and then click on statistics tab, or, re-start the database.


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags:
Categories:
Actions: E-mail | Permalink | Comments (0) | Comment RSSRSS comment feed

More features in 3.0 - news from the coding trenches

June 17, 2009 21:27 by scimore
Scimore version 3.0 is progressing at a steady rate. Thanks for all the feedback, feature request and bug reports - keep'em coming.

Based on customer requests, we decided to to add additional features to the version 3.0 release.

Existing new features in 3.0 (downloadable version).
  • Replication, one way (read only), bi-directional and filtering
  • New table format, minimizing space overhead.
  • Improved freetext search, controlling weighting of results & approximative results
Additional features for 3.0 (not released yet):
  • 64bit client (c++, ado.net)
  • 64bit server & embedded
  • Update in place (massive speedup for updates of updates)
  • Compressed table format used for redo log, and sys-undo logs with additional space overhead improvements...
  • Improved page splitting for less storage requirements
  • Speed CPU optimizations in core functions

Note that the new 3.0 branch is considered better than version 2.5.

We are currently testing, and we must say 3.0 looks very promising.
Speed is improved by 30-50% on average, while the storage requirements are up to 4 times less.

Regards,
The Scimore Team


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Replication changes since v3.0.1520

January 8, 2009 17:15 by scimore

Hi everyone,

The new version of 3.0 beta branch is no longer compatible with the previous 3.* versions.  The changes affect only replication and require an upgrade. If you don't use replication, you don't need to upgrade, otherwise, here is an upgrade steps: 

1. If merge replication used, synchronize those tables using the old syntax, so the publisher has the latest data.

2. Uninstall old and install new versions. You can get it from download page under 3.0 (Beta). Both subscriber and the publisher must be the latest versions.

3. Cleanup publisher and subscriber. Because [#histories] table structure has been changed you must to drop all replication related tables ([#histories], [tablename#history],[tablename#conflicts] and replicated tables [tablename] (only on subscriber(s)! ). Use the commands below, since DROP TABLE will fail:

For [#histories] in the database "test" (subscriber and publisher):


execute DQL
begin
exec_ddl ' DROP TABLE  [test.#histories] '
end

For each published/subscribed table in the "test" (subscriber and publisher):

execute DQL
begin
exec_ddl ' DROP TABLE  [test.mytable#history] '
end

execute DQL
begin
exec_ddl ' DROP TABLE  [test.mytable#conflicts] '
end

Note: depending of the replication type READ/MERGE [test.#history] and [test.#conflicts] may exists or not.

Restart database servers and then drop each subscribed table. (only on subscriber!):

DROP TABLE test.mytable

4. Subscribe again:

Use SUBSCRIBE SQL command from the previous blog "Testing replication".

New syncrhonization SQL commands:

To synchronize a single or the multiple tables:

SYNCRHONIZE TABLE mytable [where] ,  optional [where] used to apply filter conditions to subscribe to a subset of rows.

Example: SYNCHRONIZE TABLE mytable where id between 0 and 100

Syncrhonize more than one table:

BEGIN TRAN
SYNCHRONIZE TABLE mytable1 [where clause]
SYNCHRONIZE TABLE mytable2 [where clause]
COMMIT

To synchronize entire database without [where] clause, on the subscriber invoke the following command:

SYNCHRONIZE DATABASE mydb

The command will syncrhonize all subscribed tables in the "mydb" database.

The subscribe/unsubscribe commands remain unchanged. For the syntax read the "Testing Replication" blog entry. 

 

Thanks, Scimore Team


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags:
Categories:
Actions: E-mail | Permalink | Comments (2) | Comment RSSRSS comment feed

3.0 beta ready for download

November 28, 2008 12:40 by scimore

Hi everyone

The new 3.0 branch is maturing and is now significantly better than 2.5.
You can get it from the download page.

New things are:

  • Bi-directional replication
  • Improved connection string properties - so only connection varies when your application goes from embedded, server to distributed
  • Free-text search improvements
  • Improved handling of connections over non-local networks
  • Dependency linking of DLL's, improving easy of use. Required DLL's get copied around together with ADO.Net Provider.
  • The usual many bug fixes.

We are working on further testing, and hope very soon to add even more exiting news for the 3.0 release. More information will follow.

3.0 is not binary backward compatible with 2.5, so remember to backup your databases, and restore them on the 3.0 version.

If your so lucky you discover a bug, please report it on http://bug.scimore.com.

Thanks,
Scimore Team


Currently rated 5.0 by 2 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Testing Replication

May 29, 2008 10:15 by scimore


We just started to test a new Bi-Directional and Read Only replications that will be added in a new database release.

The bi-directional replication resolve conflicts in a column level. For instance, if both subscriber and publisher modify the same row, but different columns, the changes will be merged to a single row. If modified the same column, the conflict will be not resolvable and it will be logged to a conflicts table for the user action. When a conflict occurs, for each conflicting row, 3 rows will be present. 2 rows from subscriber: original row since the last synchronization and the latest one. And 1 the newest from publisher.
If a column has been incremented/decremented by numeric value, for example "Amount" or "NumberOfItems", it could be possible to apply delta changes from subscribers to publisher and avoid conflicts.

The synchronization process is transactional, i.e. if user application aborts replication or unexpectedly exists, the merge changes will be rollback in both subscriber and publisher.

Publisher can synchronize with many subscribers. In addition, subscriber has an option to subscribe to a set of rows from the table with a filter (where clause) condition. Publisher can be distributed DB or Server and subscriber - Server or EmbeddedDB.

We have publishing the latest alpha versions at: http://www.scimore.com/download/release/3.0/ both server and embedded. Try it!

To subscribe the table, execute on subscriber DB following command:

subscribe from 'dev21:999'
(
merge test.testconflicts to test.testconflicts,
merge webstats.geoip to test.geoip
)

Where: 'dev21:999' - publisher, test.testconflicts & webstats.geoip - tables on publisher, test.testconflicts & test.geoip - subscribed tables(these tables will be automatically created).

When you subscribe first time, the command will tell to publisher to initialize tables for publishing and subscribe. The second subscriber will only subscribe to already published table, so it will execute faster.

To unsubscribe, execute in the same format as subscribe command, except change command name to unsubscribe:

unsubscribe from 'dev21:999'
(
merge test.testconflicts to test.testconflicts,
merge webstats.geoip to test.geoip
)

Unsubscribe command will remove information from publisher about the current subscriber and drop the local subscribed tables in subscriber.

To synchronize, execute:

synchronize test.testconflicts with test.testconflicts on 'dev21:999'

OR

synchronize test.geoip with webstats.geoip on 'dev21:999'

The command will merge the changes from both subscriber and publisher.
NOTE: when you execute this command in manager or with .NET provider the conflicts will not be resolved. The command will return an error and abort synchronization.
A conflict appears:
1. UPDATE DELETE - one deleted, other updated.
2. UPDATE UPDATE - both updated the same column. If each updates different columns, sync merges the changes and applies it.

The conflict resolution is possible with the .NET provider new classes for replication.

If you replicate big tables, it may take time when subscribing, you could increment number of DB cache pages to 10.000-100.000 so it will run much faster. The default is 3000 pages where each takes 8KB.

Currently rated 3.5 by 2 people

  • Currently 3.5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags:
Categories:
Actions: E-mail | Permalink | Comments (0) | Comment RSSRSS comment feed

Great support from worldwide user base

April 29, 2008 11:13 by scimore

We now have registered users from 48 countries. People from 124 countries have visited our web site.

The registered users are spread over a wide variety of industries. Financial & Banking sector, Mobile business, Oil industry, Military, Government, Online gaming and Entertainment.

For this we are thankfull. 

We would like to thank everyone who uses our database, posts bugs, gives us challenges, and requests new features.


Currently rated 4.0 by 1 people

  • Currently 4/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

How to use BLOB data types with ScimoreDB

April 18, 2008 11:23 by scimore

The following sample code shows how to use BLOB data types with ScimoreDB.
The full source code of the sample project, which you can compile and execute, is located under http://www.scimore.com/download/2.5/BlobDemo.zip. The sample uses Scimore embedded database.

Suppose that we already have an established connection with a server.

ScimoreConnection connection;
ScimoreCommand command;

The code snippet showing how to insert an image into a BLOB field:

Bitmap image = SystemIcons.Warning.ToBitmap();
byte[] value = (byte[])TypeDescriptor.GetConverter(image).ConvertTo(image, typeof(byte[]));

command.CommandText = "INSERT INTO table_1 VALUES (@p1)";
ScimoreParameter p1 = new ScimoreParameter("p1", ScimoreDbType.Blob);
p1.Value = value;
command.Parameters.Add(p1);

int rowsAffected = command.ExecuteNonQuery();

And the following code snippet shows how to read data from a BLOB field

command.CommandText = "SELECT column_1 FROM table_1";
ScimoreDataReader reader = command.ExecuteReader();
while (reader.Read())
{
    value = (byte[])reader["column_1"];
    image = (Bitmap)TypeDescriptor.GetConverter(typeof(Bitmap)).ConvertFrom(value);
    image.Save("warning.bmp");
}


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , , , , ,
Categories:
Actions: E-mail | Permalink | Comments (7) | Comment RSSRSS comment feed

Documentation updated

March 26, 2008 17:15 by scimore

Documentation has just been updated, and is closed to be finished.

The 2.5 features likes t-sql features, identity, flow control and flush log have been added.

A new minor bug release of the server and manager. We added a feature to the manager, allowing for easier feedback.

And another exciting thing has just happendend yesterday morning. One of the developers reported that he finished the first alpha version of our Visual Studio 2008 integration.


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: ,
Categories:
Actions: E-mail | Permalink | Comments (33) | Comment RSSRSS comment feed