amen-common
v1.0.5
Published
Amen common func
Downloads
2
Readme
Environment variables
- AMEN_SC_HOSTNAME (ex localhost, without http)
- AMEN_SC_PORT (ex. 3030)
- COUCHDB_URL
- COUCHDB_PASSWORD
- COUCHDB_USERNAME
Data structures
interface RequestCounter{
count:number;
start_at:number;
}
interface Shard{
name:string;
startKey:string;//is not inclusive.
endKey:string;//is inclusive
}
//By inclusive it means (startKey, endKey]. And hence startKey is indeed the endKey of left sibling shard. And hence all value in the shard must be greater than startKey. And the maximum value that a shard can have is endKey.
interface VShard extends Shard{
storage_engine:StorageEngine;
request_counter:RequestCounter;
}
/**
*
* * shard_all_good: Request load is stable and no shard analysis is required.
* * shard_analysis: Does virtual sharding and computes stress distribution. It will be repetitive, and depth level is MAX_SHARD_ANALYSIS_DEPTH(default=1).
*/
enum ShardEngineState{
shard_all_good="shard_all_good",
shard_analysis="shard_analysis"
}
Components
- Shard: A named range of data. Every shard has a name and range of data it upholds.
- StorageEngine: Storage Engine used to store data actual data for the shard.
- VirtualShardEngine: Works on top of storageEngine by creating virtual shards. It create a BTree of
VShard
. And uses a compare function like:
//first key is always search key in BalancedTrees algos
function compare(s1:Shard,s2:Shard){
s2.startKey <= s1.startKey <= s2.endKey -> return 0;
s1.startKey < s2.startKey -> return -1;
s1.startKey >= s2.endKey -> return 1;
}
- VirtualShards: A cell in BTree of
virtual shards
ofVirtualShardEngine
. Each of this cell has link to same underlying storage engine. And once this cell (virtual shard) is selected for service, itsrequest_counter
is increased. - ShardEngine: An underlying storage independent Shard Management/Interaction System. Drives the virtual shard analysis, to give feedback to NodeEngine to distribute shards.
- NodeEngine: Basic working node in AMEN cluster which holds all functionality to verify claims.
- Admin: Admin node which manages Nodes and provide them shard info.
Algorithm for sharding
Presumptions
PRECONDITIONS_FOR_SHARDING: Threshold for analysis is MIN_ANALYSIS_COUNT =5000, MAX_REQUEST_RATE=500; That is as soon as Shard process 5000 request it will start analysis for sharding requirement, provided the request rate is 500 request per second.
For Sharding analysis to begin following condition should match:
this.virtualShardEngine.request_counter.count
should be MIN_ANALYSIS_COUNT + 1;- Request rate must be greater than MAX_REQUEST_RATE. Request rate is calculated as such:
let requestRate = this.virtualShardEngine.request_counter.count/(Date.now()-this.virtualShardEngine.request_counter.at)*1000 ;
Lets assume following
let shard:Shard = {
name: "a123",
startKey:"s1",
endKey:"s10000",
}
let requestRate = 501
VirtualShardEngine
is initialized with following data.
//this set from config received at registration tme from admin
//this.shardConfigReceivedFromAdmin = [{startKey:'s1',endKey:'s10000'];
this.virtualShardEngine = new VirtualShardingEngine(this.shardConfigReceivedFromAdmin);
- ShardEngine is in ShardEngineState.shard_all_good state.
- Subdivisions of virtual shards be , SUB_DIV_COUNT = 10;
- The keys between s1 and s10000 goes as such, s1,s2,s3...,s9999,s10000.
- ShardComparator is defined as such:
let shardComparator =(s1:Shard,s2:Shard)=>{ s2.startKey <= s1.startKey <= s2.endKey -> return 0; s1.startKey < s2.startKey -> return -1; s1.startKey >= s2.endKey -> return 1; }
Algorithm for auto-sharding
- ShardEngine will check if
this.virtualShardEngine.request_counter.count === MIN_ANALYSIS_COUNT+1
.then it will check
calculate_request_rate > MAX_REQUEST_RATE
If
calculate_request_rate <= MAX_REQUEST_RATE
this.analysisDepth = 0; this.state = ShardEngineState.shard_all_good; this.currentVShardEngineConfig=this.shardConfigReceivedFromAdmin;//{startKey:'s1',endKey:'s10000'};//search config received from admin this.virtualShardEngine = new VirtualShardEngine(this.currentVShardEngineConfig);
This call will wipe all internal Btree of VirtualShardEngine. ShardEngine be in
shard_all_good
state.Else if
calculate_request_rate > MAX_REQUEST_RATE
, then check current state:if(this.state === ShardEngineState.shard_analysis){ //lets collect stress distribution let stressDistribution:{startKey,endKey,requestCount}[] = this.virtualShardEngine.getStressDistribution(); //key will be shard name and KeyRange will be {startKey,endKey} type object let shardStrategy = this.createShardStrategy(stressDistribution); this.analysisDepth = 0; this.state = ShardEngineState.shard_all_good; this.currentVShardEngineConfig=this.shardConfigReceivedFromAdmin;//{startKey:'s1',endKey:'s10000'};//search config received from admin this.virtualShardEngine = new VirtualShardEngine(this.currentVShardEngineConfig); if(!shardStrategy){ sendAdminShardStrategyCallBack(shardStrategy); } } else{ this.analysisDepth = 0; this.state = ShardEngineState.shard_analysis; this.currentVShardEngineConfig=this.getFreshSubDividedDistributionConfig(startKey,endKey,SUB_DIV_COUNT); this.virtualShardEngine = new VirtualShardEngine(this.currentVShardEngineConfig); }
it will change ShardEngine state to
shard_analysis
and invokedoVirtualSharding
.
Algorithm for getFreshSubDividedDistributionConfig
- Call
let subDivResult = Storage.getSubdivision(startKey,endKey,SUB_DIV_COUNT)
, which will return a data of type:
{
/**
* Number of elements between start key and end key in the btree.
* */
size: number;
/**
* Average length between keys
**/
avg_distance:number;
/**
* will have SUB_DIV_COUNT+2 keys, with each key separated by avg_distance number of keys in between in the BTree. For example a SUB_DIV_COUNT=1 will give a 3 keys, which can be used to create two virtual shards later.
**/
keys:[startKey,....,endKey],
}
If
subDivResult.size < 2
do nothing, as such a shard is at its full capacity usage. //TODO in future add duplicate shards to increase read capacity of shards return this.shardConfigReceivedFromAdmin;If
subDivResult.size > 2
, than issue :return createVShardConfig(subDivResult); /*[ {startKey: 's1',endKey:'s1000'}, {startKey: 's1000',endKey:'s2000'}, ] */
Algorithm for CreateShardStrategy
enum Howzy{
FIRST_ITSELF_GT_80,
COMING_FROM_20,
COMING_FROM_40
}
createShardStrategy(stressDistribution:{startKey,endKey,requestCount}[]){
let result:{startKey,endKey}[];
let sum=0;
let startKey:Key=stressDistribution[0].startKey;
let endKey:Key =stressDistribution[stressDistribution.length-1].endKey;
let howzy:Howzy = Howzy.FIRST_ITSELF_GT_80;
for(let stress of stressDistribution){
sum+=stress.requestCount;
let totalStressPercentage = 100 * sum/MIN_ANALYSIS_COUNT;
if(totalStressPercentage<=20){
howzy=Howzy.COMING_FROM_20;
let endKey1=stress.endKey;
result.createShards=[
{startKey,endKey1}
];
}
else if( 20<totalStressPercentage<80) {
howzy=Howzy.COMING_FROM_40;
let endKey1=stress.endKey;
result.createShards=[
{startKey,endKey1},
{endKey1,endKey}
];
if(totalStressPercentage>40){
return result;
}
}
else if(totalStressPercentage>=80){
switch(howzy){
case Howzy.FIRST_ITSELF_GT_80:{
//there is no point in distributing this , its used in full capacity.No sharding will be sent.
return
}break;
case Howzy.COMING_FROM_20:{
let currentStressPercentage = 100 * stress.requestCount/MIN_ANALYSIS_COUNT;
if(currentStressPercentage<=60){
//this means
let endKey1=stress.endKey;
//with at most 20/80 distribution;
result.createShards=[
{startKey,endKey1},{endKey1,endKey}
];
return result;
}else{
//there is no point in distributing this , its used in full capacity.No sharding will be sent.
return;
}
}break;
case Howzy.COMING_FROM_40:{
let endKey1=stress.endKey;
//with at least 20/80 and at most 39/61 distribution;
result.createShards=[
{startKey,endKey1},{endKey1,endKey}
];
return result;
}break;
}
}
}
}
Interfaces
interface Key{
id:string;
claim:string;
}
interface KeyRange{
start:Key;
end:Key;
}
class NodeEngine{
protected shards: Record<string,BTreeEngine>;
...
}
class ShardEngine{
public keyExtent: KeyRange;
}
Node registration
- Node sends
node_info
onregister_node
channel. This channel is listened by admin node. Upon listening it will send the shard strategy for the node in registration_response.interface NodeInfo{ node_name:string } interface RegistrationResponse{ //key is shard name, and value is shard range shards:Record<string,KeyRange> }
- Upon receiving this response, Node will create a separate BTree for each Shard info received.
If a shard exist already than the new range will be compared with theclass NodeEngine{ protected shards: Record<string,ShardEngine>; ... }
keyExtent
on existingShardEngine
. if it has changed, than a the old one will be dropped and replaced by a new one as per new configurations.