Parallel Programming: Synchronized Objects

Sanden (2011) shows how to use synchronized objects for concurrency in Java. A synchronized object is a “safe” object: its data is protected by a lock, which a thread must hold while it executes one of the object's synchronized methods (the critical sections).  In Java we can create threads in two ways: (1) extend the class Thread or (2) implement the interface Runnable.  In both cases the code of the thread goes in the method void run(), and the thread completes its execution when it reaches the end of that method (much like returning from a subroutine in FORTRAN).  The class Thread provides the constructors public Thread() and public Thread(Runnable runObject), the latter for wrapping a Runnable object, along with methods like public void start(), which begins executing run() in a new thread.
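To make the two ways of creating a thread concrete, here is a minimal sketch. The class names (SafeCounter, CountingThread, CountingTask, ThreadCreationDemo) are hypothetical and chosen only for illustration; they do not come from Sanden (2011). Both threads update one shared “safe” object through its synchronized methods.

```java
// Hypothetical names throughout; a sketch, not code from Sanden (2011).

/** A "safe" object: its state is only touched inside synchronized methods. */
class SafeCounter {
    private int count = 0;

    // The object's lock is held while a synchronized method runs.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }
}

/** Way 1: extend Thread and override run(). */
class CountingThread extends Thread {
    private final SafeCounter counter;
    private final int times;

    CountingThread(SafeCounter counter, int times) {
        this.counter = counter;
        this.times = times;
    }

    @Override
    public void run() {
        for (int i = 0; i < times; i++) {
            counter.increment();
        }
    } // the thread completes when run() returns
}

/** Way 2: implement Runnable and hand the object to new Thread(Runnable). */
class CountingTask implements Runnable {
    private final SafeCounter counter;
    private final int times;

    CountingTask(SafeCounter counter, int times) {
        this.counter = counter;
        this.times = times;
    }

    @Override
    public void run() {
        for (int i = 0; i < times; i++) {
            counter.increment();
        }
    }
}

class ThreadCreationDemo {
    /** Starts one thread of each kind and returns the final shared count. */
    static int runBoth(int timesPerThread) {
        SafeCounter counter = new SafeCounter();
        Thread first = new CountingThread(counter, timesPerThread);
        Thread second = new Thread(new CountingTask(counter, timesPerThread));
        first.start();  // start() schedules run() on a new thread
        second.start();
        try {
            first.join(); // wait for both threads to finish
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runBoth(10_000)); // prints 20000
    }
}
```

Without the synchronized keyword on increment(), the two threads could interleave their updates and the final count could come out lower than 20000.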

Additional Examples:

MapReduce

According to Hortonworks (2013), the MapReduce process at a high level is: Input -> Map -> Shuffle and Sort -> Reduce -> Output.

Tasks:  Mappers process the records of a data set filed away in a distributed system and place the wanted data in a map/aggregate under a certain key.  Reducers know what the key values are; each reducer takes all the values that the mappers stored under the same key on the different nodes of the cluster (per the distributed system) and reduces them to the data that is relevant (Hortonworks, 2013). Different reducers can work on different keys.

Example: A good example of a MapReduce request is to look at all CTU graduate students and sum up their current outstanding school loans per degree level.  Thus, the final output from our example would be:

  • Doctoral Students Current Outstanding School Loan Amount
  • Masters Students Current Outstanding School Loan Amount

Now let’s assume that this ran in Hadoop, which can do MapReduce.   Also, let’s assume that I could use 50 nodes (which play the role of threads here) to process this request.  The bad data that gets thrown out in the mapper phase would be the Undergraduate Students, given that they do not match the initial search criteria.  The safe data will be the records associated with Doctoral and Masters Students.  So, during the mapping phase, each node will assign Doctoral Students to one key and Masters Students to another key.  Every node uses the same two keys for its respective students, so the keys match across all nodes.  The reducer uses these keys, and the safe data filed under them, to sum up the current outstanding school loan amounts under the correct group.  Thus, once all nodes have completed the reducer part, we will have our two amounts:

  • Doctoral Students Current Outstanding School Loan
  • Masters Students Current Outstanding School Loan
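The phases of this example (Map, Shuffle and Sort, Reduce) can be sketched in plain Java. This is only a single-machine illustration under stated assumptions: it does not use the Hadoop API, the Student class and its fields are hypothetical, and each inner list stands in for the slice of data held by one node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-JVM illustration only; every name here is hypothetical, not Hadoop API.
class LoanMapReduceSketch {

    /** Hypothetical input record: a student's degree level and outstanding loan. */
    static final class Student {
        final String degreeLevel;
        final double outstandingLoan;

        Student(String degreeLevel, double outstandingLoan) {
            this.degreeLevel = degreeLevel;
            this.outstandingLoan = outstandingLoan;
        }
    }

    /** Map: emit (degreeLevel, loan) for graduate students; drop everyone else. */
    static List<Map.Entry<String, Double>> map(List<Student> split) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (Student s : split) {
            if (s.degreeLevel.equals("Doctoral") || s.degreeLevel.equals("Masters")) {
                pairs.add(Map.entry(s.degreeLevel, s.outstandingLoan));
            }
        }
        return pairs;
    }

    /** Shuffle and sort: gather the values from all mappers under their (sorted) key. */
    static Map<String, List<Double>> shuffle(List<List<Map.Entry<String, Double>>> mapperOutputs) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (List<Map.Entry<String, Double>> output : mapperOutputs) {
            for (Map.Entry<String, Double> pair : output) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        return grouped;
    }

    /** Reduce: sum all loan amounts that share one key. */
    static double reduce(List<Double> loans) {
        double total = 0;
        for (double loan : loans) {
            total += loan;
        }
        return total;
    }

    /** Runs all phases; each inner list plays the role of one node's data. */
    static Map<String, Double> run(List<List<Student>> splits) {
        List<List<Map.Entry<String, Double>>> mapperOutputs = new ArrayList<>();
        for (List<Student> split : splits) {
            mapperOutputs.add(map(split));
        }
        Map<String, Double> totals = new TreeMap<>();
        shuffle(mapperOutputs).forEach((level, loans) -> totals.put(level, reduce(loans)));
        return totals;
    }

    public static void main(String[] args) {
        List<Student> nodeA = List.of(new Student("Doctoral", 100.0),
                new Student("Undergraduate", 50.0), new Student("Masters", 30.0));
        List<Student> nodeB = List.of(new Student("Doctoral", 50.0),
                new Student("Masters", 40.0));
        System.out.println(run(List.of(nodeA, nodeB))); // prints {Doctoral=150.0, Masters=70.0}
    }
}
```

In a real cluster the map calls would run on separate nodes and the shuffle would move data across the network; here they run one after another so the data flow is easy to follow.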

Complexity could be added if we only wanted to look at graduate students who are active or non-active service members.  The query could also be broken down further by gender, profession, or diversity signifiers; we could even map to the students' current industry.

Resources

Parallel Programming: Threads

A thread is a unit (or sequence of code) that can be executed by a scheduler, essentially a task (Sanden, 2011). A single thread (task) has one program counter and a sequence of code. Multi-threading occurs when many threads, each with its own program counter, share common code. Thus, a multi-threaded program has many sequences of execution that can be assigned to different processors to run in parallel (simultaneously) to speed up a task. Another way to do multi-threading is to execute the same code on different processors with different inputs. If data is shared between the threads, there is a need for a “safe” object through synchronization, where only one thread at a time can access the data stored in the “safe” object. It is through these “safe” objects that a thread can communicate with another thread.
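As a rough sketch of the "same code, different inputs" form of multi-threading, the example below splits an array into chunks, lets each thread sum its own chunk, and has the threads report their results through one “safe” object. The names SafeTotal and ParallelSumSketch are hypothetical, and the sketch assumes threadCount is at least 1.

```java
// Hypothetical names; a sketch that assumes threadCount >= 1.

/** The "safe" object through which the worker threads communicate their results. */
class SafeTotal {
    private long total = 0;

    // Only one thread at a time can be inside a synchronized method of this object.
    public synchronized void add(long amount) {
        total += amount;
    }

    public synchronized long get() {
        return total;
    }
}

class ParallelSumSketch {
    /** Sums data with threadCount threads that all run the same code on different chunks. */
    static long sum(int[] data, int threadCount) {
        SafeTotal total = new SafeTotal();
        Thread[] workers = new Thread[threadCount];
        int chunk = (data.length + threadCount - 1) / threadCount; // ceiling division

        for (int t = 0; t < threadCount; t++) {
            final int from = Math.min(t * chunk, data.length);
            final int to = Math.min(from + chunk, data.length);
            workers[t] = new Thread(() -> {
                long partial = 0; // local to this thread, so no lock is needed here
                for (int i = from; i < to; i++) {
                    partial += data[i];
                }
                total.add(partial); // shared data, so go through the safe object
            });
            workers[t].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every chunk has been added
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1; // 1..100
        }
        System.out.println(sum(data, 4)); // prints 5050
    }
}
```

Each thread keeps its partial sum in a local variable and touches the shared object only once, which keeps the time spent holding the lock short.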

An additional example that may help illustrate the material: 

Suppose we would like to know the average of the sum of all the credits and the average of the sum of all the debits made in personal checking accounts in December at SunTrust Bank. Using MapReduce techniques with multiple threads, we can go through the bank's entire database system to find the accounts and their timestamped transactions, map out all the data, and reduce it to the two numbers our query needs to return.
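One possible reading of this query, sketched with Java parallel streams: sum the December credits and debits per account, then average those per-account sums for each transaction type. That interpretation is an assumption, as are the Transaction class, its fields, and the "CREDIT"/"DEBIT" labels; none of this reflects any real bank's schema.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical schema and names; assumes "average of the sum" means
// "average over accounts of each account's December total".
class AccountAveragesSketch {

    /** Hypothetical transaction record. */
    static final class Transaction {
        private final String accountId;
        private final String type; // "CREDIT" or "DEBIT" (assumed labels)
        private final int month;   // 1..12, taken from the timestamp
        private final double amount;

        Transaction(String accountId, String type, int month, double amount) {
            this.accountId = accountId;
            this.type = type;
            this.month = month;
            this.amount = amount;
        }

        String getAccountId() { return accountId; }
        String getType() { return type; }
        int getMonth() { return month; }
        double getAmount() { return amount; }
    }

    static Map<String, Double> decemberAverages(List<Transaction> transactions) {
        // Map + shuffle: keep December rows and group them by type, then by account,
        // summing the amounts. parallelStream() lets several threads share the work.
        Map<String, Map<String, Double>> sums = transactions.parallelStream()
                .filter(t -> t.getMonth() == 12)
                .collect(Collectors.groupingBy(Transaction::getType,
                        Collectors.groupingBy(Transaction::getAccountId,
                                Collectors.summingDouble(Transaction::getAmount))));

        // Reduce: average the per-account sums for each transaction type.
        Map<String, Double> averages = new TreeMap<>();
        sums.forEach((type, perAccount) -> averages.put(type,
                perAccount.values().stream()
                        .mapToDouble(Double::doubleValue)
                        .average()
                        .orElse(0.0)));
        return averages;
    }

    public static void main(String[] args) {
        List<Transaction> transactions = List.of(
                new Transaction("A", "CREDIT", 12, 100.0),
                new Transaction("A", "CREDIT", 12, 200.0),
                new Transaction("A", "DEBIT", 12, 50.0),
                new Transaction("B", "CREDIT", 12, 100.0),
                new Transaction("B", "DEBIT", 12, 100.0),
                new Transaction("B", "DEBIT", 12, 50.0),
                new Transaction("A", "CREDIT", 11, 1000.0)); // November, filtered out
        System.out.println(decemberAverages(transactions)); // prints {CREDIT=200.0, DEBIT=100.0}
    }
}
```

The stream collectors take care of combining the partial results from each worker thread, so no explicit “safe” object is needed in this version.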

Resources: