A scalable database for persistent, binary byte trees with shared nodes
$10000-19990 USD
Fechado
Publicado há quase 10 anos
$10000-19990 USD
Pago na entrega
Your job is to create a database for a persistent, binary tree structure which will hold byte values. That database will be used to store a huge library of programs created with our platform, and is expected to grow to big sizes (terabytes), so it must be scalable. You must implement a smart compression algorithm that will recognise identical nodes and eliminate them. So, instead of storing trees as strings, it will store them as a flat sequence of nodes of values (bytes) and pointers. Notice the example below:
ORIGINAL TREE: (((1 (2 (1 (2 0)))) (7 (1 (2 (1 (2 0)))))) ((1 (2 (1 (2 0)))) ((1 (2 (1 (2 0)))) ((1 (2 (1 (2 0)))) (5 (1 (2 (1 (2 0))))))))) (125 bytes)
COMPRESSED TREE: (2 0) (1 *0) (2 *1) (1 *2) (5 *3) (*3 *4) (*3 *5) (1 *6) (16 bytes)
On the compressed tree, pointers are tagged with a * . To recover the original tree, all you have to do is to dereference the pointers. For example, starting with the node (3 5), you can expand it as follows:
(*3 *5)
((1 *2) (*3 *4))
((1 (2 *1)) ((1 *2) (5 *3)))
((1 (2 (1 *0))) ((1 (2 *1)) (5 (1 *2))))
... and so on, until the original tree is recovered.
The use of that particular strategy is mandatory. Other than that, you are free to design whatever scalability techniques and strategies you like.
The final product must expose an API that will allow you to:
Save a tree with a name. Example:
db.save_tree("program1",[1,[[3,1],[1,2]]])
Get a tree with a specific name. Example:
db.get_tree("program1") = [1,[[3,1],[1,2]]]
Serialise a tree to a binary format. For example, the tree used above is serialised as follows:
serialize_binary(tree) = "100010100000100001000000000001000100000000000110000100000000010100101000000000110000000001100000000100000000000110000000010110000100000000110" // (17 bytes)
Pack the binary serialisation of the string into a UTF16 string. That format will be used to efficient exchange of data between the server and client via Ajax. For example:
pack_utf16(serialize_binary(tree)) = "七匄嘀瀀娠傔习儀渃丬嘆" // (22 bytes)
Unserialise and unpack the representations above.
You are free to add other operations if it proves necessary (for example, update_node(tree,path)), but that is optional.
There is no preference in how the API is exposed.
The final product will be put to stress tests before acceptance. The tests specifications as well as tools to perform those will be provided, so you can execute them by himself before submitting the work.
You are free to chose whatever programming language or technologies you want, including existing technologies such as other databases - as long as you get the desired results.
The payment can be done via paypal, bank transfer or bitcoins (preferred). Feel free to contact me for more information. Please submit your proposal together with an official note of an approximated cost, as well as examples of previous works on the area.
Hello,
Your project looks very interesting and I would like to participate in it. For the best performance I'd suggest to use C++ for the development.
I have a question regarding the stress tests. Could you please provide more information regarding the expected limitations and performance? E.g. how big could the binary trees be? As long as you want to be able to convert them into UTF16 strings to transfer them through Ajax they shouldn't be too big, at least not gigabytes, am I right? How fast saving/retreiving of trees should work (without taking into account HDD limitations) in seconds?
Must all the trees be packed into a single file or is it acceptable to save every single tree into a separate file?
What platform do you expect this API to work on? Windows, Linux, MAC, anything else?
I look forward to hear back from you.
--
Thank you,
Roman
Hi,
My name is Mike. You have the interesting project and I will be glad to help you.
Please contact me to discuss the project details at your convenience.
Best regards,
Mike
Hi ,
We, Veltrod software services are a software consulting company specialized in providing Mobile, ECommerce and Social media frameworks using cutting edge and emerging technology. Leveraging best-in-class people, processes, and technologies, Veltrod provides high-quality software development and consulting services to independent software vendors and enterprises with WOW factor.
We are specialized in providing solutions on the below mentioned areas.
1. Mobile application development (iPhone, Android, BB, Windows Mobile)
2. ECommerce Solutions
3. Windows application development
4. Web application development (Open Source,.NET, JAVA)
5. Cloud based solution
6. Image Editing
7. Games development
8. Independent testing
If this project is offered to us, then we can allocate a dedicated team of Project manager, Graphic designer, developers & testers and provide the high quality services to the lowest cost.
Thanks
Santhosh
Hi,
We have gone through your requirement and would like to take up this task and provide qulatity result in minimal time.
Hoping to hear from you soon.
Techno Verstand
We are currently working on a binary tree database using C language but without GUI...we can share with you the code if you want.
Hello, that's nice project and I would like to participate on it. I teach C/C++ and Python on university and I have experience with this type of projects. I look forward for great cooperation.
Hello,
your problem is a very interesting one.
However, I'm wondering to know the requirements for memory and cpu footprint (probably the lesser is the better).
For the implementation part, I should follow the following strategy:
- Create a representation of the tree in a maneageable memory structure (if it doesn't yet exists)
- Calculate a summary value for each node element, computed considering its child nodes (values and configuration)
- Put the summary value of each node as a key in a dictionary element considering:
A) To avoid inserting anything if it already exists
B) Place as value of the dictionary entry the summary value of the child nodes (if valued-nodes. i.e. leaves, it is empty)
- Construct the output flat structure from the the dictionary table values, starting from the fully valued nodes (entries with both children empty) then the half-valued, and moving up.
I guess that the complexity of this solution is related to the value of the nodes (N) in a order of N*Log(N) (due the indexed seek in the internal lookup table of the dictionary).
About the storage of the compressed tree on the filesystem I guess that a normal filesystem with a simple layout can be enough (In the past, I've developed an EDM system capable of handling millions of documents and it is currently adequate).
I can write the API in any language i know as you prefer: C, C++, C# or Java
Best Regards,
Daniele Faggi
Hi,
We are a Software development Company based in India with the Manpower of more then 35 Developers with an Average Experience of more then 5 Years.
The Goal is to provide the Most Efficient & Quality Services at a Minimum Cost of development to Increase the ROI for our Clients.
We have specialized in Following languages
PHP
.Net
SQL
Wordpress
Magento
MySQL
Astrisk
OS Platforms:
Windows 7 (X64, X86)
Windows XP
Cent Os 5.0 & Above
Other Linux Systems.
Mobile Application:
Symbian S40
Symbian S60
I-Phone
Android 2.1 & Above
We have been Associated with Brands Like IIM, Reliance , Indian Railway ,Online Travel Companies, Betting Exchange,
Odds Comparation Portals.
We work on Various as-pacts of Commercials such as Hourly Basis , Project Basis , Resource Basis.
In Resource & Hourly basis we offer Services like Time Tracking, Remote Monitoring , CCTV Monitoring for all our clients to
verify the Resource Availability.
We Assure the 100% Result oriented Architecture for your Business.
Looking for a Long term Relationship.
India