Jiayin Wang is currently a tenure-track Assistant Professor in the Computer Science Department at Montclair State University.
She received her Ph.D. in Spring 2017 from the Computer Science Department at the University of Massachusetts Boston, advised by Dr. Bo Sheng. She received her Bachelor's degree in Electrical Engineering from Xidian University, China, in 2005.
Her research interests include big data, cloud computing, mobile computing, and wireless networks. She is working on resource management, scheduling, and performance evaluation in big data computing systems.
My research focuses on resource management, scheduling, and performance evaluation in large-scale platforms for big data processing. I am currently working on MapReduce/Hadoop. MapReduce is a very popular big-data processing framework, and Hadoop is an open-source implementation of MapReduce. Given a batch of MapReduce jobs, my aim is to reduce the total execution time of all jobs and improve resource utilization during job execution.
The scheduling problem in MapReduce is different from the traditional job scheduling problem because the reduce phase usually starts before the map phase is finished in order to “shuffle” the intermediate data. This paper develops a new strategy, named OMO, which aims to optimize the overlap between the map and reduce phases. Our solution includes two new techniques, lazy start of reduce tasks and batch finish of map tasks, which capture the characteristics of the overlap in a MapReduce process and achieve a good alignment of the two phases. We have implemented OMO on Hadoop and evaluated its performance with extensive experiments. The results show that OMO's performance is superior in terms of the total completion length (i.e., makespan) of a batch of jobs.
The Hadoop ecosystem has evolved into its second generation, Hadoop YARN, which adopts fine-grained resource management schemes for job scheduling. In YARN, there is no “slot”, which was the building block in the old versions, and the system no longer distinguishes between map and reduce tasks when allocating resources. Instead, each task specifies a resource request in the form of <2G,1core> (i.e., requesting 2 GB of memory and 1 CPU core), and it is assigned to a node with sufficient capacity. However, existing schedulers in YARN do not consider the efficiency of resource utilization when multiple jobs run concurrently in a cluster.
Motivated by this problem, we designed a YARN scheduler, named HaSTE, which can effectively reduce the makespan of MapReduce jobs on the YARN platform by leveraging the information of requested resources, resource capacities, and dependency between tasks. Moreover, we proposed an opportunistic scheduling scheme to reassign reserved but idle resources to other waiting tasks. The major goal of this scheme is to improve system resource utilization without incurring severe resource contention due to resource over-provisioning.
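The <2G,1core>-style request model described above can be illustrated with a minimal sketch. This is not HaSTE or any real YARN scheduler: the names `Resource`, `Node`, and `assign` are hypothetical, and the placement rule is a plain first-fit chosen only to show how a node can run out of one resource while another sits idle.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """A two-dimensional resource vector (hypothetical helper type)."""
    memory_gb: int
    cores: int

    def fits_in(self, other: "Resource") -> bool:
        # A request fits only if every dimension fits.
        return self.memory_gb <= other.memory_gb and self.cores <= other.cores


@dataclass
class Node:
    name: str
    free: Resource


def assign(request: Resource, nodes: List[Node]) -> Optional[str]:
    """First-fit placement (an assumption, not the policy of any YARN scheduler)."""
    for node in nodes:
        if request.fits_in(node.free):
            node.free = Resource(node.free.memory_gb - request.memory_gb,
                                 node.free.cores - request.cores)
            return node.name
    return None  # no node has sufficient capacity; the task must wait


# node-a lacks memory for a <2G,1core> request; node-b has only 2 cores.
nodes = [Node("node-a", Resource(1, 4)), Node("node-b", Resource(8, 2))]
placed = [assign(Resource(2, 1), nodes) for _ in range(3)]
print(placed)          # ['node-b', 'node-b', None]
print(nodes[1].free)   # Resource(memory_gb=4, cores=0)
```

In this toy run the third task waits even though node-b still has 4 GB of free memory, which is the kind of utilization gap that motivates scheduling on multiple resource dimensions at once.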
Native Hadoop only allows static slot configuration, i.e., fixed numbers of map slots and reduce slots throughout the lifetime of a cluster. However, we found that such a static configuration may lead to low system resource utilization as well as a long completion length. Motivated by this, we developed a fair and efficient slot configuration and scheduling scheme for Hadoop clusters, called FRESH, which can derive the best slot setting, dynamically configure slots, and appropriately assign tasks to the available slots, so that it not only improves the makespan but also guarantees fairness among batch jobs.
Smartphones play an important role in mobile social networks. This paper presents a Mobile Message Board (MMB) system for smartphone users to post and share messages in a certain area. Our system is built upon an ad-hoc communication model and allows users to browse nearby information without pre-registering with any server. Our algorithm design focuses on message management on each phone, considering its own schedule of turning the wireless device on and off. We present algorithms for two different cases to maximize the availability of the messages. Furthermore, we have implemented our solutions on commercial smartphones and conducted experiments and simulations for evaluation. The results are supportive and show that the MMB system is efficient and effective for location-based message dissemination.
MapReduce has become a popular data processing framework in the past few years. The scheduling algorithm is crucial to the performance of a MapReduce cluster, especially when the cluster is concurrently executing a batch of MapReduce jobs. However, the scheduling problem in MapReduce is different from the traditional job scheduling problem because the reduce phase usually starts before the map phase is finished in order to “shuffle” the intermediate data. This paper develops a new strategy, named OMO, which aims to optimize the overlap between the map and reduce phases. Our solution includes two new techniques, lazy start of reduce tasks and batch finish of map tasks, which capture the characteristics of the overlap in a MapReduce process and achieve a good alignment of the two phases. We have implemented OMO on Hadoop and evaluated its performance with extensive experiments. The results show that OMO's performance is superior in terms of the total completion length (i.e., makespan) of a batch of jobs.
This paper investigates routing protocols in smartphone-based mobile ad-hoc networks. We introduce a new dual-radio communication model, where a long-range, low-cost, and low-rate radio is integrated into smartphones to assist regular radio interfaces such as WiFi and Bluetooth. We propose to use the long-range radio to carry small management data packets to improve the routing protocols. Specifically, we develop new schemes to improve the efficiency of the path establishment and path recovery processes in on-demand ad-hoc routing protocols. We have prototyped our solution, LAAR, on Android phones and evaluated its performance with small-scale experiments and large-scale simulations implemented on NS2. The results show that LAAR significantly improves the performance.
Hadoop YARN is an open project developed by the Apache Software Foundation to provide a resource management framework for large-scale parallel data processing. However, a resource-waiting deadlock can occur under the Fair scheduler when the resources requested by applications exceed the amount that the cluster can provide. In such a case, the YARN system halts if all resources are occupied by ApplicationMasters, a special task of each job that negotiates resources for processing tasks and coordinates job execution. Therefore, we develop a new admission control mechanism which dynamically reserves resources for processing tasks in order to avoid resource-waiting deadlocks while maintaining good performance. We implement and evaluate our new mechanism in Hadoop YARN v2.2.0. The experimental results show the effectiveness of this mechanism under MapReduce benchmarks.
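The deadlock condition described above can be made concrete with a toy model. The sketch below is only an illustration under simplifying assumptions: the cluster is counted in identical containers, each job needs exactly one ApplicationMaster container, and the reservation is a fixed number rather than the dynamic reservation the paper develops. The function names are hypothetical and do not correspond to any YARN API.

```python
def admit_application_masters(capacity: int, pending_jobs: int,
                              reserved_for_tasks: int) -> int:
    """Return how many ApplicationMasters may start.

    `reserved_for_tasks` containers are held back so that admitted jobs
    always have somewhere to run their processing tasks (a fixed reserve
    is an assumption made for this sketch).
    """
    if not 0 <= reserved_for_tasks <= capacity:
        raise ValueError("reserve must be between 0 and the cluster capacity")
    return min(pending_jobs, capacity - reserved_for_tasks)


def is_deadlocked(capacity: int, running_ams: int) -> bool:
    """True when every container holds an ApplicationMaster, so no task can run."""
    return running_ams > 0 and running_ams >= capacity


capacity, pending = 4, 6

# No admission control: ApplicationMasters fill the whole cluster.
ams = admit_application_masters(capacity, pending, reserved_for_tasks=0)
print(ams, is_deadlocked(capacity, ams))   # 4 True

# Holding back one container keeps at least one slot free for tasks.
ams = admit_application_masters(capacity, pending, reserved_for_tasks=1)
print(ams, is_deadlocked(capacity, ams))   # 3 False
```

A larger reserve admits fewer concurrent jobs but leaves more room for their tasks, which is the trade-off an admission control mechanism has to balance.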
Although a substantial amount of research has examined the constructs of warmth and competence, far less has examined how these constructs develop and what benefits may accrue when warmth and competence are cultivated. Yet there are positive consequences, both emotional and behavioral, that are likely to occur when brands hold perceptions of both. In this paper, we shed light on when and how warmth and competence are jointly promoted in brands, and why these reputations matter.
This paper studies video buffer control for streaming video data to mobile devices. We target the design challenge that arises when the wireless link quality is dynamic due to environmental factors or user mobility. We develop a Dynamic and Agile buffer-control scheme, called DAB, that adaptively adjusts the video buffer size based on measurements of the signal strength (RSSI) and the accelerometer on the smartphone. Our goal is to keep playback smooth while delivering as little data as possible to the end user in order to save bandwidth cost. We have implemented our solution on the Android platform and evaluated it with experiments. Compared to the traditional video buffer scheme, DAB significantly improves performance in terms of playback quality and buffer efficiency.
Smartphones have become more and more popular in the past few years. Motivated by the fact that location plays an extremely important role in mobile applications, this paper develops an efficient local message dissemination system, PASA, based on a new communication model called passive broadcast. It is based on the method of overloading device names described in MDSRoB [14] and Bluejacking [23]. In this new model, each node does not maintain connection state, and data delivery is initiated by a receiver via a "scan" operation. The representative carriers of passive broadcast include Bluetooth and WiFi-Direct, both of which define a mandatory "peer discovery" scan function. Passive broadcast features negligible cost for establishing and maintaining direct links and is extremely suitable for short message dissemination in the proximity. In this paper, we present PASA with complete protocols and in-depth analysis for optimization. We have prototyped our solution on commercial phones and evaluated it with comprehensive experiments and simulation.
Hadoop is an emerging framework for parallel big data processing. Although it is becoming popular, Hadoop is too complex for regular users to fully understand all the system parameters and tune them appropriately. Especially when processing a batch of jobs, the default Hadoop settings may cause inefficient resource utilization and unnecessarily prolong the execution time. This paper considers an extremely important setting, the slot configuration, which by default is fixed and static. We propose an enhanced Hadoop system called FRESH which can derive the best slot setting, dynamically configure slots, and appropriately assign tasks to the available slots. The experimental results show that when serving a batch of MapReduce jobs, FRESH significantly improves the makespan as well as the fairness among jobs.
The MapReduce framework has become the de facto scheme for scalable semi-structured and unstructured data processing in recent years. The Hadoop ecosystem has evolved into its second generation, Hadoop YARN, which adopts fine-grained resource management schemes for job scheduling. One of the primary performance concerns in YARN is how to minimize the total completion length, i.e., makespan, of a set of MapReduce jobs. However, the precedence constraint or fairness constraint in currently widely used scheduling policies in YARN, such as FIFO and Fair, can both lead to inefficient resource allocation in the Hadoop YARN cluster. They also ignore the dependency between tasks, which is crucial for the efficiency of resource utilization. We thus propose a new YARN scheduler, named HaSTE, which can effectively reduce the makespan of MapReduce jobs in YARN by leveraging the information of requested resources, resource capacities, and dependency between tasks. We implemented HaSTE as a pluggable scheduler in the most recent version of Hadoop YARN and evaluated it with classic MapReduce benchmarks. The experimental results demonstrate that our YARN scheduler effectively reduces the makespan and improves resource utilization compared to the current scheduling policies.
This paper targets the application of cloud storage management for mobile devices. Because of limited bandwidth and other resources, most existing cloud storage apps for smartphones do not keep local copies of files. This efficient design, however, limits the applications' capabilities. In this paper, our goal is to extend the available file operations for cloud storage services to better serve smartphone users. We develop Skyfiles, an efficient and secure file management system that supports more advanced file operations. Our basic idea is to utilize cloud instances to assist file operations. In particular, Skyfiles supports download, compress, encrypt, and convert operations, as well as file transfer between two smartphone users' cloud storage spaces. In addition, we design a protocol for users to share their idle instances.
The MapReduce framework and its open-source implementation Hadoop have become the de facto platform for scalable analysis on large data sets in recent years. One of the primary concerns in Hadoop is how to minimize the completion length (i.e., makespan) of a set of MapReduce jobs. The current Hadoop only allows static slot configuration, i.e., fixed numbers of map slots and reduce slots throughout the lifetime of a cluster. However, we found that such a static configuration may lead to low system resource utilization as well as a long completion length. Motivated by this, we propose a simple yet effective scheme which uses the slot ratio between map and reduce tasks as a tunable knob for reducing the makespan of a given set of jobs. By leveraging the workload information of recently completed jobs, our scheme dynamically allocates resources (or slots) to map and reduce tasks. We implemented the presented scheme in Hadoop V0.20.2 and evaluated it with representative MapReduce benchmarks on Amazon EC2. The experimental results demonstrate the effectiveness and robustness of our scheme under both simple workloads and more complex mixed workloads.
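The idea of treating the map/reduce slot ratio as a tunable knob can be sketched as follows. The abstract does not give the actual formula, so the rule here is an assumption made purely for illustration: slots are split in proportion to the map and reduce work observed in recently completed jobs, with at least one slot kept for each phase. The function `split_slots` and its parameters are hypothetical names.

```python
from typing import Tuple


def split_slots(total_slots: int, map_work: float,
                reduce_work: float) -> Tuple[int, int]:
    """Split slots between map and reduce tasks in proportion to observed work.

    `map_work` and `reduce_work` stand for the total task time measured from
    recently completed jobs (assumed inputs, e.g. in seconds).
    """
    if total_slots < 2:
        raise ValueError("need at least one slot for each phase")
    if map_work < 0 or reduce_work < 0:
        raise ValueError("workloads must be non-negative")

    total_work = map_work + reduce_work
    if total_work == 0:
        map_slots = total_slots // 2          # no history yet: split evenly
    else:
        map_slots = round(total_slots * map_work / total_work)

    # Never starve either phase completely.
    map_slots = max(1, min(total_slots - 1, map_slots))
    return map_slots, total_slots - map_slots


print(split_slots(8, 300.0, 100.0))   # (6, 2): map-heavy workload
print(split_slots(8, 10.0, 990.0))    # (1, 7): reduce-heavy workload
```

Recomputing the split whenever new job statistics arrive is what makes the configuration dynamic, in contrast to the fixed slot counts of a static setup.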
Lecturer: Deliver two presentations weekly, 75 minutes each; CS/IT 114 is the first course in the two-course version of introductory Java programming.
Teaching Assistant
Duties include assisting the instructor in developing the course exercises, grading homework, and holding Q&A sessions during office hours.
Lab Session Instructor: In charge of the lab sessions of CS110 for 40 students each semester.
Deliver two presentations weekly, 25 minutes each, followed by 50 minutes of hands-on guidance to help students fully understand the lectures and to guide them through the lab project, programmed in Java.
Teaching Assistant
Duties include giving lectures to introduce SQL, grading homework and projects, and holding the Q&A sessions through office hours.
Teaching Assistant
Duties include assisting students with programming in assembly language, grading homework and projects, and holding the Q&A sessions through office hours.
Teaching Assistant
Duties include assisting the instructor in developing the course exercises, grading homework, and holding Q&A sessions during office hours.
Office: RI-312
Tel: 973-655-4230
Email: jiayin.wang AT montclair.edu