I am a PhD student in the Electrical & Computer Engineering (ECE) Department at Carnegie Mellon University. My research interests are in Computer Security, particularly in software testing and software verification. My advisor is Prof. Priya Narasimhan.
I completed the Fifth-year Master's program in Computer Science at Carnegie Mellon University in August 2009. Before that, I earned my Bachelor of Science in Computer Science at Carnegie Mellon, with an additional major in Economics. I previously worked on failure diagnosis for distributed systems, and I contributed to the Hadoop Chukwa sub-project, where some of my work on Hadoop log analysis and visualization was implemented.
Characterization, Visualization and Diagnosis of MapReduce Systems (Jun '07 - Aug '12)
We applied our work on problem diagnosis in distributed systems to MapReduce systems, focusing on white-box approaches that capture their behavior so that it can be characterized and visualized to aid diagnosis. We built state-machine based models of MapReduce behavior from Hadoop's system logs (WASL '08) and presented visualizations of this behavior to aid operators in the automated diagnosis of failures and in the performance debugging and optimization of MapReduce programs (HotCloud '09).
Adaptive Control in Distributed Systems (Jun '08 - Dec '08, Intel Research Pittsburgh)
With the advent of cheap multi-core processors and commoditized server hardware platforms, large clusters of computer systems have become common. It is useful to be able to distribute large, arbitrary pieces of computation across multiple servers in a hosted environment. However, these environments are shared and often have interference from other user jobs, and (hardware) failures are common.
We are interested in monitoring large computations distributed across multiple servers to dynamically identify and cope with problems arising from interference by other users and from hardware failures, and to deal with workload changes. This monitoring would then enable us to perform intelligent runtime adaptation.
Project link: SLIPstream
Problem Diagnosis in Distributed Systems (Jun '07 - Present)
Distributed systems have become ubiquitous in today's Internet-enabled world. As these systems grow in scale, problems in them have become increasingly difficult to detect. We focus on two aspects of problem diagnosis: fault localization, which identifies the nodes in a distributed system that have failed, and root-cause diagnosis, which seeks to find the cause of the failure.
We are interested in using information that can be readily extracted from running systems in production environments without invasive code modification or instrumentation, and without imposing expensive performance overheads on the operation of the system.
Project link: Fingerpointing
Dynamic Upgrades in Distributed Systems (Jan '07 - Jun '07)
Software upgrades are a significant source of planned downtimes in enterprise distributed systems. Many software upgrades overrun time and cost budgets, and even leave software systems in unusable states after the upgrade. A key contributing factor to the complication of upgrades is the complexity of the dependencies amongst multiple components in a large distributed system.
The goal of this work is to devise a method for upgrading from one software system to a different one, possibly with completely different semantics, in a dependency-agnostic fashion, while keeping both the old and the new systems operating concurrently. Our aim is to minimize the downtime and dependency-induced mistakes associated with software upgrades.
U. Drolia, R. Martins, J. Tan, A. Chheda, M. Sanghavi, R. Gandhi, and P. Narasimhan. The Case For Mobile Edge-Clouds. In 10th IEEE International Conference on Ubiquitous Intelligence and Computing (UIC 2013), December 2013.
C. Wang, S. Kavulya, J. Tan, L. Hu, M. Kutare, M. Kasick, K. Schwan, P. Narasimhan, and R. Gandhi. Performance Troubleshooting in Data Centers: An Annotated Bibliography. In ACM SIGOPS Operating Systems Review 47(3), November 2013.
E. Garduno, S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan. Theia: Visual Signatures for Problem Diagnosis in Large Hadoop Clusters. In USENIX ;login, 38(2), April 2013.
E. Garduno, S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan. Theia: Visual Signatures for Problem Diagnosis in Large Hadoop Clusters. In 26th Large Installation System Administration (LISA) Conference, San Diego, CA, Dec 2012. Awarded Best Student Paper.
J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan. Lightweight Black-box Failure Detection for Distributed Systems. In Workshop on Management of Big Data systems (MBDS) 2012, co-located with the International Conference on Autonomic Computing, San Jose, CA, Sep 2012.
K. Bare, S. Kavulya, J. Tan, X. Pan, E. Marinelli, M. Kasick, R. Gandhi, P. Narasimhan. ASDF: An Automated, Online Framework for Diagnosing Performance Problems. Architecting Dependable Systems, in Lecture Notes in Computer Science, Volume 6420/2010, No. 7, Pages 201-226, 2010.
J. Campbell, A. Ganesan, B. Gotow, S. Kavulya, J. Mulholland, P. Narasimhan, S. Ramasubramanian, M. Shuster, and J. Tan. Understanding and Improving the Diagnostic Workflow of MapReduce Users. In 5th ACM Symposium on Computer Human Interaction for Management of Information Technology (CHIMIT), Boston, MA, Dec 2011.
Y. Tan, B. Lee, H. Chak, J. Tan, J. Li, X. Xiao, E. Chng, S. Date, and A. Narishige. Hadoop Framework: Impact of Data Organization on Performance. In Software: Practice and Experience, 43(11), John Wiley and Sons Ltd., May 2011.
J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan. Visual, Log-based Causal Tracing for Performance Debugging of MapReduce Systems. 30th IEEE International Conference on Distributed Computing Systems (ICDCS) 2010, Genoa, Italy, Jun 2010.
S. Kavulya, J. Tan, R. Gandhi, P. Narasimhan. An Analysis of Traces from a Production MapReduce Cluster. 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2010, Melbourne, Victoria, Australia, May 2010.
J. Tan, X. Pan, S. Kavulya, E. Marinelli, R. Gandhi, P. Narasimhan. Kahuna: Problem Diagnosis for MapReduce-based Cloud Computing Environments. 12th IEEE/IFIP Network Operations and Management Symposium (NOMS) 2010, Osaka, Japan, Apr 2010.
M. Kasick, J. Tan, R. Gandhi, P. Narasimhan. Black Box Problem Diagnosis in Parallel File Systems. 8th USENIX Conference on File and Storage Technologies (FAST) 2010, San Jose, CA, Feb 2010.
X. Pan, J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan. Blind Men and the Elephant: Piecing Together Hadoop for Diagnosis. 20th IEEE International Symposium on Software Reliability Engineering (ISSRE), Industrial Track, Mysuru, India, Nov 2009.
J. Tan, X. Pan, S. Kavulya, R. Gandhi, P. Narasimhan. Mochi: Visual Log-Analysis Based Tools for Debugging Hadoop. USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '09), San Diego, CA, Jun 2009.
X. Pan, S. Kavulya, J. Tan, R. Gandhi, P. Narasimhan. Ganesha: Black-Box Diagnosis for MapReduce Systems. Workshop on Hot Topics in Measurement & Modeling of Computer Systems (HotMetrics), Seattle, WA, Jun 2009.
M. Kasick, K. Bare, E. Marinelli, J. Tan, R. Gandhi, P. Narasimhan. System-Call Based Problem Diagnosis for PVFS. Workshop on Hot Topics in System Dependability (HotDep 2009), Estoril, Lisbon, Portugal, Jun 2009.
J. Tan, X. Pan, S. Kavulya, R. Gandhi, P. Narasimhan. SALSA: Analyzing Logs as StAte Machines. USENIX Workshop on Analysis of System Logs (WASL), San Diego, CA, Dec 2008.
T. Dumitras, J. Tan, Z. Gho and P. Narasimhan. No More HotDependencies: Toward Dependency-Agnostic Upgrades in Distributed Systems. In Workshop on Hot Topics in System Dependability (HotDep), Edinburgh, Scotland, Jun 2007.
J. Tan, J. Nahata. PETAL: Preset Encoding Table Information Leakage. Technical Report CMU-PDL-13-106, Carnegie Mellon University Parallel Data Laboratory, April 2013.
J. Tan. Log-based Approaches to Characterizing and Diagnosing MapReduce Systems. Carnegie Mellon University School of Computer Science Master's Thesis and Technical Report CMU-CS-09-143. Jul 2009.
X. Pan, J. Tan, S. Kavulya, R. Gandhi, P. Narasimhan. Ganesha: Black-box Fault Diagnosis for MapReduce Environments. Technical Report CMU-PDL-08-112, Carnegie Mellon University Parallel Data Laboratory, Sep 2008.
K. Bare, M. Kasick, S. Kavulya, E. Marinelli, X. Pan, J. Tan, R. Gandhi, P. Narasimhan. ASDF: Automated, Online Fingerpointing for Hadoop. Technical Report CMU-PDL-08-104, Carnegie Mellon University Parallel Data Laboratory, May 2008.
J. Tan and P. Narasimhan. RAMS and BlackSheep: Inferring white-box application behavior using black-box techniques. Technical Report CMU-PDL-08-103, Carnegie Mellon University Parallel Data Laboratory, May 2008.
J. Tan. Hadoop Special Interest Group Meeting: Anatomy of a Hadoop MapReduce Program. Singapore Management University (SMU), Singapore. 25 September 2009.
J. Tan. Invited Talk: Applications, Visualization and Diagnosis of MapReduce Systems. Nanyang Technological University (NTU) School of Computer Engineering, Singapore. 23 September 2009.
P. Narasimhan, J. Tan. Automated Diagnosis of Problems in Hadoop. Hadoop Summit 2009, Santa Clara, CA. 10 June 2009.
J. Tan, P. Narasimhan. Automating Problem Diagnosis for MapReduce Environments. Yahoo! Technical Talk, Santa Clara, CA. 3 Oct 2008.
I was Senior Staff Photographer at The Tartan, Carnegie Mellon's student-run weekly broadsheet newspaper with a circulation of 5000, from August 2005 to August 2007. Here are some of the photos I have taken while on staff at The Tartan.
Currently, I am on the Photo Staff of The Thistle, Carnegie Mellon's student-run, student-published yearbook.
I also drop by the many fun activities organised by the Singapore Students' Association to catch up with fellow Singaporeans.
Email: tanjiaqi AT cmu DOT edu
Email: jiaqi.tan AT alumni DOT cmu DOT edu