Replicated, Distributed Key-Value Storage Solution

$30.00

Description

Introduction

For this assignment you will design and implement a simplified version of the Amazon Dy-namo key-value storage system. (The simplifications are such that one would probably not consider it Dynamo any more!) You will have to implement partitioning, replication, and failure handling.

Your goal is to implement a distributed key-value storage system that provides both availability and linearizability. Your implementation must be able to perform successful read and write operations even in the presence of failures.

This document provides an implementation guideline for achieving a successful project. However, you are free to come up with your own design in many parts of the project, provided that you provide both availability and linearizability at the same time in a way that the tester can understand. You must, however, implement partitioning and replication in exactly the manner that Dynamo does (so that the tester can understand your implementation).

This document assumes that you are already familiar with Dyanmo. If you are not, you should first go and read the Dynamo paper (which is required reading for the course) and review the Dynamo slides from class.

This assignment has many similarities to the previous assignment, your Chord ring im-plementation, and you may draw from your Chord implementation (or other previous projects) as appropriate. Please remember to cite any code that is taken from external references, such as Android Developers or the Oracle Java tutorials.

Please read this document in its entirety before you begin. It is long, but that is only because it is complicated and precise. Revisit the instructions regularly as you implement to be sure you’re implementing what is required. There are notes at the end that may assist you if you encounter common problems; make sure to review them as you go, as well.

This assignment will require you to:

  • Implement data replication

  • Implement data partitioning

  • Handle node failures while continuing to provide availability and linearizability

  • Getting Started

We are providing a project template for you to get started. As in the previous assignments, the template configures a number of things on your behalf. It is important that you do not

  • This assignment is based, with permission, on an assignment developed and used by Steve Ko in his version of CSE 486/586. Large portions of text are used verbatim from that assignment.

1

change settings that you do not have to change to complete your project.

You will need to download the project from GitHub Classroom and open it in Android Studio. You should have received the GitHub Classroom invitation for this project via Piazza.

Instructions for cloning the project and opening it in Android Studio are the same as before, except that the base repository name is SimpleDynamo. You can find YouTube tutorials detailing the process here.

2 The Content Provider

The graded portion of this project is a ContentProvider providing simplified Dynamo func-tionality to store and retrieve data across multiple devices even under failures. The pro-vided template uses the package name edu.buffalo.cse.cse486586.simpledynamo, and defines a content provider authority and class. Please do not change the package name, and use the content provider authority and class for your implementation.

Like the SimpleDht project, all functionality for the requirements for this project should be provided by your content provider. It should handle availability in the face of failures via replication, as well as linearizability, independent of any Activity you may implement. The Activity in this project is used for your testing purposes only.

You will use SHA-1 as the hash function to generate IDs for the system, the same as with our Chord implementation. You should use the genHash() function from Project 3 for this purpose. For this project, you will need to run five emulators. We will be using the multi-port configuration as described in project 1. Your app will open one server socket that listens on port 10000, but it will connect to a different port number on the IP address 10.0.2.2 for each emulator, as follows:

emulator serial

port

emulator-5554

11108

emulator-5556

11112

emulator-5558

11116

emulator-5560

11120

emulator-5562

11124

2.1 Implementing the Content Provider

In implementing your content provider, you may make the following assumptions:

  1. There will always be 5 nodes in the system (except for failures). There is no need to implement joining or departure of nodes. Unlike the previous project, your nodes may hard-code the ring structure, as long as they handle failure.

  1. There will be at most one failure at a time. We will emulate a failure by force-closing an app on one emulator. We will not close the emulator itself.

  1. All failures are temporary. You can assume that a failed node will recover soon. It will not be permanently unavailable during a run.

2

  1. Correctness is more important than performance! Once you handle failures correctly, if you have time, you can improve performance.

  1. You do not need to implement virtual nodes. (These are part of the Dynamo protocol, see the Dynamo paper!) All partitions are static and fixed.

  1. You do not need to handle hinted handoff. This means that, during a failure, it is OK to replicate on only two nodes (if the third replica would be on a failed node). (Again, see the Dynamo paper.)

  1. Linearizability will be checked only on a per-key basis. You do not need to provide any consistency guarantees between keys. Formally, you must implement per-key lineariz-ability.

2.1.1 Requirements

1. The URI of your content provider must be:

content://edu.buffalo.cse.cse486586.simpledynamo.provider

  1. The content provider should implement the Dynamo-like functionality described here. Like the previous assignment, this includes all communication and functionality for the replicated store, including creation of appropriate tasks and sockets, response to requests from other AVDs, etc. No functionality for the Dynamo store should be pro-vided by your Activity, which is used only for your own testing.

  1. When a node recovers from failure, it should copy all of the object writes that it missed during the failure. This can be done by making requests to the appropriate nodes and copying their data.

  1. Your content provider must support concurrent read and write operations.

  1. Your content provider must handle a failure occurring during read and write opera-tions.

  1. Replication in your content provider should be handled just as it is in Dynamo. Each key-value pair should be replicated over three consecutive partitions, starting from the partition that the key belongs to (based on its consistent hashing).

  1. All replicas must store the same value for each key. This is per-key linearizability.

  1. Each content provider instance should have a node ID derived from its emulator serial

number. This node ID must be obtained by applying the specified hash function (named genHash(), above) to the emulator serial number. For example, the node ID of the con-tent provider running on emulator-5554 should be node_id = genHash(“5554”). The ID string you should use is exactly the string found in portStr in the telephony hack. This is necessary to place each node in the correct place in the Dynamo ring.

3

  1. Any app using your content provider can submit arbitrary key-value pairs of Java strings (e.g., <“I want to”, “store this”>), query arbitrary keys and retrieve their values, or delete arbitrary keys.

  1. As in the SimpleDht assignment, your app must support query operations for the spe-cial keys “@” and “*”, which are defined as before.

  1. As in the previous assignments, your content provider should have two columns.

    • The first column should be named “key” (all lowercase, no quotation marks). This column is used to store keys.

    • The second column should be named “value” (all lowercase, no quotation marks). This column is used to store the values associated with keys.

    • All keys and values stored by your provider should be Java strings.

  1. Any app should be able to access (both read and write) your content provider. As in previous assignments, please do not require any permissions to access your content provider.

Note that each content provider must store only the key-value pairs local to its own partition of the ID space, and key-value pairs it is required to replicate from other partitions!

2.2 Writing the main activity

The provided template code includes an activity to be used for your own testing and debug-ging. It will not be specifically graded or evaluated in any way. You can (and should) use it to evaluate your Dynamo implementation as you see fit.

2.3 Implementation Guidelines

The following are some guidelines for implementing your content provider. Remember that you are free to implement your project however you wish, as long is it meets the project requirements! These guidelines are provided only to assist you in designing a successful implementation. In particular, many parts of these guidelines can be improved upon using techniques from the lectures or literature, if time allows. Remember that your partition-ing and replication behaviors must be exactly as specified above, whatever techniques you choose.

Membership. Just as in the original Dynamo paper, every node can know about every other node, including its location and partition of the ring. This means that any node can forward a request that it cannot handle locally directly to a node that can without using ring-based routing.

Request routing. When there are no failures, a request can be forwarded directly to the co-ordinator for the key (i.e., the key’s successor), and the coordinator should handle read/write operations for that key.

4

Quorum replication. In order to implement linearizability, you can implement quorum-based replication. Note that the original design does not handle linearizability, and you will need to adapt it. If you implement quorum replication, the replication degree (N) should be 3, and the reader and writer quorum sizes (R and W) should be 2. This means that, given a key and no failures, the key’s coordinator as well as the next two successor nodes in the ring should store the key and its value, and that the coordinator should, for every request for a get or put, contact two other nodes and get a vote from each. You can version objects in order to distinguish stale objects from the most recent copy (e.g., following failure and re-joining of a node). Upon read, if more than one version of an object is returned, the coordinator should use the most recent version.

Chain replication. Another alternative for implementing linearizability is chain replica-tion. The details for this can be found in Chain Replication for Supporting High Throughput and Availability, by Robbert van Renesse and Fred B. Schneider, found at http://www.cs.cornell. edu/home/rvr/papers/OSDI04.pdf. In chain replication, a write is always made to the first par-tition, and then propagates to the next two partitions in order. The last partition returns the result of the write. Read operations are always made to the last of the chain of partitions.

Failure handling. Failures must be handled very carefully, as there may be many corner cases to consider and cover. Just as described in the Dynamo paper, all messages between nodes can be used to detect failures. You can use a timeout on socket operations (in par-ticular, read operations), just as we did in the GroupMessenger2 project for this, using a reasonable timeout period (e.g., 100 ms) and considering a node failed if it does not respond before the timeout. Just as in the previous project, do not rely on socket creation or connection status to detect failure, as our emulated environment may distort this signal. When the co-ordinator for a request fails and it therefore does not respond to a request, its successor can be contacted next to fulfill the request. Remember that you can assume at most one failure at a time.

2.4 Testing

We have testing programs to help you see how your code does with our grading criteria. If you find rough edges in the testing programs, please report it so the teaching staff can fix it.

You should also test your program yourself by writing your own tests and drivers.

The testing program for this project is divided into six phases. The first three phases will not take much time compared to the second three, so you should focus on them first and finish them as quickly as possible. This will allow you to spend as much time as needed on the last three phases.

The phases are:

  1. Testing basic operations

    • This phase will test insert, query, and delete (including the special keys @ and *). This will ensure that all data is correctly replicated. It will not perform any con-current operations or test failures

5

  1. Testing concurrent operations with differing keys

    • This phase will test your implementation under concurrent operations, but with-out failure.

    • The tester will use different key-value pairs inserted and queried concurrently on various nodes.

  1. Testing concurrent operations with like keys

    • This phase will test your implementation under concurrent operations with the same keys, but no failure.

    • The tester will use the same set of key-value pairs inserted and queried concur-rently on all nodes.

  1. Testing one failure

    • This phase will test every operation under a single failure.

    • For each operation, one node will crash before the operation starts. After the operation is done, the node will recover.

    • This will be repeated for each operation.

  1. Testing concurrent operations with one failure

    • This phase will execute various operations concurrently and crash one node in the middle of execution. After some time, the node will recover (also during exe-cution).

  1. Testing concurrent operations with repeated failures

    • This phase will crash one node at a time, repeatedly. That is, one node will crash and then recover during execution, and then after its recovery another node will crash and then recover, etc.

    • There will be a brief period of time between each crash-recover sequence.

Each testing phase is quite intensive, and will take quite some time to finish. Therefore, the tester allows you to specify which phase you want to test on the command line so that you need not wait for all tests every time. However, you will still need to ensure that you run the tester in its entirety before you submit. We will not run the testing phases separately in our grading.

  • You can specify which phase of the test you wish to run with the -p or –phase argument to the tester.

  • You can see the available arguments for the tester with the -h option.

6

Note that if you run an individual phase with -p or –phase, the tester will always perform a fresh install. However, when running all phases, the tester will perform a fresh install only before phases 1 and 2. This means that from phase 2 onward, all data from previous phases will remain intact.

The instructions for using the testing programs are the following:

  • Download a testing program for your platform. If your platform does not run it, please report it.

    1. Windows: Tested on 64-bit Windows 8.

    1. Linux: Tested on 64-bit Debian 9 and 64-bit Ubuntu 17.10 (see below for important information about 64-bit systems).

    1. Mac OS: Tested on 64-bit Mac OS 10.9 Mavericks.

  • Before you run the program, please make sure that you are running all five AVDs. You can use python run_avd.py 5 to start them.

  • Remember to start the emulator network by running set_redir.py 10000.

  • Like the previous testing program, this test will require you to provide your APK as a command line argument, and will take care of installing and uninstalling it on the emulators.

  • Run the testing program from the command line.

  • It may issue some warnings or errors during execution. Some of them are normal, some may indicate errors in your program. Examine them to find out!

  • The testing program will give you partial and final scores.

Remember during your own testing that you may see strange contents in your content provider if you do not uninstall your application between tests. In particular, updating the app from Android Studio does not uninstall the existing app first, so the content provider’s state will not be cleared. You can uninstall your app by hand with the following command:

adb -s < emulator > uninstall edu . buffalo . cse . cse486586 . s i m p l e d h t

Notes for 64-bit Linux: The testing program is compiled 32-bit. If you get an error like the following, or the shell reports command not found when you run the executable, install the

32-bit libz for your system:

./simpledht-grading.linux: error while loading shared libraries:

libz.so.1: cannot open shared object file: No such file or directory

On Debian-based distributions, you can accomplish this with the command apt-get install zlib1g:i386 as root (you may need to use sudo or su). If apt-get reports an error about the architecture or says the package is not found, you may need to enable multiarch. To do this, run dpkg –add-architecture i386 as root, then update your APT repositories with apt-get update as root. Once this is done, you should be able to install the 32-bit libz.

For other distributions you will need to consult your distribution documentation.

7

  • Submission

We use UB CSE autograder for submission. You can find autograder at https://autograder.

cse.buffalo.edu/, and log in with your UBITName and password.

Once again, it is critical that you follow everything below exactly. Failure to do so will lead to no credit for this assignment.

Zip up your entire Android Studio project source tree in a single zip file. Ensure that all of the following are true:

  1. You did not create your zip file from inside the GroupMessenger1 directory.

  1. The top-level directory in your zip file is SimpleDynamo or SimpleDynamo-<something>, and it contains build.gradle and all of your sources.

  1. You used a zip utility and not any other compression or archive tool: this means no 7-Zip, no RAR, no tar, etc.

4 Deadline

This project is due 2018-05-11 11:59:00 AM. This is one hour before our class. This is a firm deadline. If the timestamp on your submission is 11:59:01, it is a late submission. You are expected to attend class on this day!

Note that, like the previous project, this deadline has been pre-extended, and it will not be extended any further. This deadline cannot be extended due to the end of the semester!

5 Grading

This assignment is 20% of your final grade. Credit for this assignment will be apportioned as follows:

  • Phase 1: 3%

  • Phase 2: 4%

  • Phase 3: 3%

  • Phase 4: 5%

  • Phase 5: 5%

  • Phase 6: 3%

Thus, an application passing all six phases would receive 23 total points, representing 20% of the course grade as full credit for the project plus 3% of “extra credit”.

8

Notes

  • Please do not use a separate timer to handle failures, as this will make debugging very difficult and is likely to introduce race conditions. Use socket timeouts and handle all possible exceptions that may be thrown when there is a failure. They are:

SocketTimeoutException, StreamCorruptedException, EOFException, and the parent IOException.

  • Please reuse your TCP connections with a single remote host, instead of creating a new socket every time you send a message. You can use the same socket for both sending to and receiving from a remote AVD. Think about how you will coordinate this.

  • Please do not use Java object serialization (i.e., implementing Serializable). This will cause very large messages to be sent and received, with unnecessarily large message overhead.

  • Please do not assume that a fixed number of DHT operations will be invoked on your system. Your implementation should not hardcode the number or types of operations in any way.

  • Remember that there is a cap on the number of AsyncTasks that you can execute si-multaneously. Plan accordingly, to keep the total number of threads that you require under this limit (which is about five).

  • If the testing program cannot install your APK on one or more emulators, try each of the following steps:

    1. Restart the problematic emulator.

    1. Clean your build in Android Studio, then invoke the “Build APK” menu item and try again.

    1. Manually install your APK using adb install to ensure that it is installable.