[ Home ] [ Research ] [ Publications ] [ CV ] [ Links ]

Publications

My publications are also listed on DBLP and Google Scholar. The most up-to-date information can be found in the CV.

| 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 |

    2017

  1. Localized fault recovery for nested fork-join programs
    G. Kestor, S. Krishnamoorthy, and W. Ma
    Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS), April 2017
  2. Optimizing the four-index integral transform using data movement lower bounds analysis
    S. Rajbhandari, F. Rastello, S. Krishnamoorthy, K. Kowalski, and P. Sadayappan
    17th ACM SIGPLAN Annual Symposium on Principles and Practices of Parallel Programming (PPoPP), February 2017
  3. Exploiting vector and multicore parallelism for recursive data- and task-parallel programs
    B. Ren, S. Krishnamoorthy, K. Agrawal, and M. Kulkarni
    17th ACM SIGPLAN Annual Symposium on Principles and Practices of Parallel Programming (PPoPP), February 2017

    2016

  5. User-assisted store recycling for dynamic task graph schedulers
    M. Kurt, S. Krishnamoorthy, G. Agrawal, and B. Ren
    ACM Transactions on Architecture and Code Optimization (TACO) vol:13(4), pp:55:1-55:24, December 2016
  6. Static and dynamic frequency scaling on multicore CPUs
    W. Bao, C. Hong, S. Krishnamoorthy, C. D. Sudheer, L.N. Pouchet, F. Rastello, and P. Sadayappan
    ACM Transactions on Architecture and Code Optimization (TACO) vol:13(4), pp:55:1-55:26, December 2016
  7. PRESAGE: protecting structured address generation against soft errors
    V. Sharma, G. Gopalakrishnan, and S. Krishnamoorthy
    International Conference on High Performance Computing, Data, and Analytics, December 2016
  8. A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment
    S. Rajbhandari, J. Kim, S. Krishnamoorthy, L. Pouchet, F. Rastello, R. Harrison, and P. Sadayappan
    Supercomputing (SC), November 2016
  9. On the impact of widening vector registers on sequence alignment
    J. Daily, A. Kalyanaraman, S. Krishnamoorthy, and B. Ren
    International Conference on Parallel Processing (ICPP), September 2016
  10. Effective padding of multi-dimensional arrays to avoid cache conflict misses
    C. Hong, W. Bao, A. Cohen, S. Krishnamoorthy, L. Pouchet, J. Ramanujam, F. Rastello, and P. Sadayappan
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2016
  11. New-Sum: a novel online ABFT scheme for general iterative methods
    D. Tao, S. Song, S. Krishnamoorthy, P. Wu, X. Liang, E. Zhang, D. Kerbyson, and Z. Chen
    ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), May 2016
  12. On fusing recursive traversals of k-ary trees
    S. Rajbhandari, J. Kim, S. Krishnamoorthy, L. Pouchet, F. Rastello, R. Harrison, and P. Sadayappan
    International Conference on Compiler Construction, March 2016
  13. PolyCheck: dynamic verification of iteration space transformations on affine programs
    W. Bao, S. Krishnamoorthy, L. Pouchet, F. Rastello, and P. Sadayappan.
    ACM Symposium on Principles of Programming Languages (POPL), January 2016

    2015

  15. CilkSpec: Optimistic Concurrency for Cilk
    S. Aga, S. Krishnamoorthy, S. Narayanasamy.
    Supercomputing (SC), November 2015
  16. Efficient execution of recursive programs on commodity vector hardware
    B. Ren, Y. Jo, S. Krishnamoorthy, K. Agrawal, and M. Kulkarni
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2015
  17. A work stealing based approach for enabling scalable optimal sequence homology detection
    J. Daily, A. Kalyanaraman, S. Krishnamoorthy, and A. Vishnu
    Journal of Parallel and Distributed Computing vol:79-80, pp:132-142, May 2015
  18. On the impact of execution models: a case study in computational chemistry
    D. Chavarria, M. Halappanavar, S. Krishnamoorthy, J. Manzano, A. Vishnu, A. Hoisie
    Joint Workshop on High-Level Parallel Programming Models and supportive Environments and Large-Scale Parallel Processing (HIPS-LSPP), May 2015
  19. Global transformations for legacy parallel applications via structural analysis and rewriting
    D. Miranda, A. Panyala, W. Ma, A. Prantl, and S. Krishnamoorthy
    Parallel Computing vol:43, pp:1-26, March 2015

    2014

  21. Communication-optimal framework for contracting distributed tensors (Best Paper Finalist)
    S. Rajbhandari, A. Nikam, P. Lai, K. Stock, S. Krishnamoorthy, and P. Sadayappan
    Supercomputing (SC), November 2014
  22. Fault-tolerant dynamic task graph scheduling (Best Student Paper Finalist)
    M. Kurt, S. Krishnamoorthy, K. Agrawal, and G. Agrawal
    Supercomputing (SC), November 2014
  23. Optimizing data locality for fork/join programs using constrained work stealing
    J. Lifflander, S. Krishnamoorthy, and L. Kale
    Supercomputing (SC), November 2014
  24. SCaLeM: a framework for characterizing and analyzing execution models
    D. Chavarria, J. Manzano, S. Krishnamoorthy, A. Vishnu, K. Barker, and A. Hoisie
    20 Years of Beowulf Workshop, October 2014
  25. Scalable replay with partial-order dependencies for message-logging fault tolerance (Best Student Paper Award)
    J. Lifflander, E. Meneses, H. Menon, P. Miller, S. Krishnamoorthy, and L. Kale
    IEEE CLUSTER, September 2014
  26. CAST: contraction algorithm for symmetric tensors
    S. Rajbhandari, A. Nikam, P. Lai, K. Stock, S. Krishnamoorthy, and P. Sadayappan
    International Conference on Parallel Processing, September 2014
  28. Checksumming strategies for data in volatile memories
    H. Arafat, S. Krishnamoorthy, and P. Sadayappan
    International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), September 2014
  29. Compiler-assisted detection of transient memory errors
    S. Tavarageri, S. Krishnamoorthy, and P. Sadayappan
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2014
  30. Addressing failures in exascale computing
    M. Snir, R. Wisniewski, J. Abraham, S. Adve, S. Bagchi, P. Balaji, J. Belak, P. Bose, F. Cappello, B. Carlson, A. Chien, P. Coteus, N. DeBardeleben, P. Diniz, C. Engelmann, M. Erez, S. Fazzari, A. Geist, R. Gupta, F. Johnson, S. Krishnamoorthy, S. Leyffer, D. Liberty, S. Mitra, T. Munson, R. Schreiber, J. Stearley, and E. Van Hensbergen
    International Journal of High Performance Computing Applications vol:28(2), pp:127-171, May 2014

    2013

  32. A framework for load balancing of tensor contraction expressions via dynamic task partitioning
    P. Lai, K. Stock, S. Rajbhandari, S. Krishnamoorthy, and P. Sadayappan
    SC 2013, November 2013
  33. Efficient scheduling of recursive control flow on GPUs
    X. Huo, S. Krishnamoorthy, and G. Agrawal
    27th International Conference on Supercomputing (ICS), June 2013
  34. Steal Tree: low-overhead tracing of work stealing schedulers
    J. Lifflander, S. Krishnamoorthy, and L. Kale
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2013
  35. Non-iterative multireference Coupled Cluster methods on heterogeneous CPU-GPU systems
    K. Bhaskaran-Nair, W. Ma, S. Krishnamoorthy, O. Villa, H. van Dam, E. Apra, and K. Kowalski
    Journal of Chemical Theory and Computation, 2013
  36. Multi-fault tolerance for Cartesian data distributions
    N. Ali, S. Krishnamoorthy, M. Halappanavar, and J. Daily
    International Journal of Parallel Programming, Computing Frontiers special issue vol:41(3) pp:469-493 2013

    2012

  38. A scalable infrastructure for the performance analysis of passive target synchronization
    M. A. Hermanns, S. Krishnamoorthy, and F. Wolf
    Parallel Computing doi:10.1016/j.parco.2012.09.002, 2012
  39. Towards scalable optimal sequence homology detection
    J. Daily, S. Krishnamoorthy, and A. Kalyanaraman
    Workshop on Parallel Algorithms and Software for Analysis of Massive Graphs (ParGraph), December 2012
  40. On the use of term rewriting for performance optimization of legacy HPC
    A. Panyala, D. Chavarria, and S. Krishnamoorthy
    International Conference on Parallel Processing (ICPP), September 2012
  41. Work stealing and persistence-based load balancers for iterative overdecomposed applications
    J. Lifflander, S. Krishnamoorthy, and L. Kale
    ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), June 2012
  42. Data-driven fault tolerance for work stealing computations
    W. Ma and S. Krishnamoorthy
    26th International Conference on Supercomputing (ICS), June 2012
  43. Empirical performance model-driven data layout optimization and library call selection
    Q. Lu, X. Gao, S. Krishnamoorthy, G. Baumgartner, J. Ramanujam, and P. Sadayappan
    Journal of Parallel and Distributed Computing vol:72(3), pp:338-352, March 2012
  44. Performance characterization of global address space applications: a case study with NWChem
    J. Hammond, S. Krishnamoorthy, S. Shende, N. Romero, and A. Malony
    Concurrency and Computation: Practice and Experience vol:24(2), pp:135-154, 2012
  45. Parameterized micro-benchmarking: an auto-tuning approach for complex applications
    W. Ma, S. Krishnamoorthy, and G. Agrawal
    ACM International Conference on Computing Frontiers, June 2012
  46. Global Futures: a multithreaded execution model for Global Arrays-based applications
    D. Chavarria, S. Krishnamoorthy, and A. Vishnu
    IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2012
  47. Load balancing of dynamical nucleation theory Monte Carlo simulations through resource sharing barriers
    H. Arafat, J. Dinan, S. Krishnamoorthy, T. Windus, and P. Sadayappan
    IEEE International Parallel and Distributed Processing Symposium, May 2012
  48. Supporting the Global Arrays PGAS Model Using MPI One-Sided Communication
    J. Dinan, P. Balaji, J. Hammond, S. Krishnamoorthy, and V. Tipparaju
    IEEE International Parallel and Distributed Processing Symposium, May 2012

    2011

  50. Power- and Cooling-Aware Parallel Performance Diagnostics
    RL. Knapp, KL. Karavanic, S. Krishnamoorthy, and A. Marquez
    The 23rd IASTED International Conference on Parallel and Distributed Computing and Systems, December 2011
  51. Scalable Implementations of Accurate Excited-state Coupled Cluster Theories: Application of High-level Methods to Porphyrin-based Systems
    K. Kowalski, S. Krishnamoorthy, R. Olson, V. Tipparaju, and E. Apra
    Supercomputing (SC), November 2011
  52. Optimizing Tensor Contraction Expressions for Hybrid CPU-GPU Execution
    W. Ma, S. Krishnamoorthy, O. Villa, K. Kowalski, and G. Agrawal
    Cluster Computing Special Issue
  53. Noncollective Communicator Creation in MPI
    J. Dinan, S. Krishnamoorthy, P. Balaji, J. Hammond, M. Krishnan, V. Tipparaju, and A. Vishnu
    Special Session on Improving MPI User and Developer Interaction, EuroMPI, September 2011
  54. A scalable replay-based infrastructure for the performance analysis of one-sided communication
    M. Hermanns, S. Krishnamoorthy, and F. Wolf
    First International Workshop on High-performance Infrastructure for Scalable Tools (WHIST), May 2011
  55. Fault Oblivious eXascale Whitepaper
    E. Van Hensbergen, R. Minnich, C. Janssen, S. Krishnamoorthy, A. Marquez, M. Gokhale, P. Sadayappan, J. McKie, and J. Appavoo
    International Workshop on Runtime and Operating Systems for Supercomputers (ROSS), May 2011
  56. Application-Specific Fault Tolerance via Data Access Characterization
    N. Ali, S. Krishnamoorthy, N. Govind, K. Kowalski, and P. Sadayappan
    Euro-Par 2011, August 2011
  57. Massively parallel implementation of the multi-reference Brillouin-Wigner CCSD method
    J. Brabec, S. Krishnamoorthy, HJJ. van Dam, K. Kowalski, and J. Pittner
    Chemical Physics Letters vol:514(4-6), pp:347-351, 2011
  58. The role of many-body effects in describing low-lying excited states of π-conjugated chromophores: high-level equation-of-motion coupled-cluster studies of fused porphyrin systems
    K. Kowalski, R. Olson, S. Krishnamoorthy, V. Tipparaju, and E. Apra
    Journal of Chemical Theory and Computation vol:7(7) pp:2200-2208, 2011
  59. GPU-Based Implementations of the Noniterative Regularized-CCSD(T) Corrections: Applications to Strongly Correlated Systems
    W. Ma, S. Krishnamoorthy, O. Villa, and K. Kowalski
    Journal of Chemical Theory and Computation, vol:7(5) pp:1316-1327, 2011
  60. Tolerating Correlated Failures for Generalized Cartesian Distributions via Bipartite Matching
    N. Ali, S. Krishnamoorthy, M. Halappanavar, and J. Daily
    ACM International Conference on Computing Frontiers (CF'11), May 2011
  61. Practical Loop Transformations for Tensor Contraction Expressions on Multi-Level Memory Hierarchies
    W. Ma, S. Krishnamoorthy, and G. Agrawal
    International Conference on Compiler Construction (CC'11), April 2011
  62. Lifeline-based Global Load Balancing
    V. Saraswat, P. Kambadur, S. Kodali, D. Grove, and S. Krishnamoorthy
    16th ACM SIGPLAN Annual Symposium on Principles and Practices of Parallel Programming (PPoPP'11), February 2011
  63. A Redundant Communication Approach to Scalable Fault Tolerance in PGAS Programming Models
    N. Ali, S. Krishnamoorthy, N. Govind, and B. Palmer
    19th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, February 2011

    2010

  65. Efficient Sparse Matrix-Matrix Multiplication on Heterogeneous High Performance Systems
    J. Siegel, O. Villa, S. Krishnamoorthy, A. Tumeo, and X. Li
    Workshop on Application/Architecture Co-design for Extreme-scale Computing (AACEC). September 2010
  66. EOMCC, MRPT, and TDDFT Studies of Charge Transfer Processes in Mixed-Valence Compounds: Application to the Spiro Molecule
    KR. Glaesemann, N. Govind, S. Krishnamoorthy, and K. Kowalski
    Journal of Physical Chemistry A
  67. Acceleration of Streamed Tensor Contraction Expressions on GPGPU-based Clusters
    W. Ma, S. Krishnamoorthy, O. Villa, and K. Kowalski
    IEEE International Conference on Cluster Computing (CLUSTER). September 2010
  68. Active-space completely-renormalized equation-of-motion coupled-cluster formalism: excited-state studies of green fluorescent protein, free-base porphyrin, and oligoporphyrin dimer
    K. Kowalski, S. Krishnamoorthy, O. Villa, J. Hammond, and N. Govind
    The Journal of Chemical Physics 132(15), 154103
  69. Load Balancing on Single- and Multi-GPU Systems
    L. Chen, O. Villa, S. Krishnamoorthy, and G. Gao
    Proceedings of the 24th IEEE International Parallel & Distributed Processing Symposium (IPDPS), April 2010
  70. Selective Recovery From Failures In A Task Parallel Programming Model
    J. Dinan, A. Singri, P. Sadayappan, and S. Krishnamoorthy
    Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing -- Resilience Workshop. May 2010
  71. Scalable communication trace compression
    S. Krishnamoorthy and K. Agarwal
    Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). May 2010
  72. High Performance Molecular Dynamic Simulation on Single and Multi-GPU Systems
    O. Villa, L. Chen, and S. Krishnamoorthy
    Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS) 2010

    2009

  74. Performance Optimization of Tensor Contraction Expressions for Many-Body methods in Quantum Chemistry
    A. Hartono, Q. Lu, T. Henretty, S. Krishnamoorthy, H. Zhang, G. Baumgartner, D.E. Bernholdt, M. Nooijen, R. Pitzer, J. Ramanujam, and P. Sadayappan
    Journal of Physical Chemistry A 113(45), pp.12715-12723
  75. Scalable Work Stealing
    J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, P. Sadayappan
    Supercomputing (SC) 2009, November 2009
  76. Data Layout Transformation for Enhancing Locality on NUCA Chip Multiprocessors
    Q. Lu, C. Alias, U. Bondhugula, T. Henretty, S. Krishnamoorthy, J. Ramanujam, A. Rountev, P. Sadayappan, Y. Chen, H. Lin, and T. Ngai
    18th International Symposium on Parallel Architectures and Compilation Techniques (PACT-18), September 2009
  77. Parametric multi-level tiling of imperfectly nested loops
    A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, P. Sadayappan
    ICS 2009: 147-157, June 2009.
  78. An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications
    N. Vydyanathan, S. Krishnamoorthy, G.M. Sabin, U.V. Catalyurek, T.M. Kurc, P. Sadayappan, J.H. Saltz
    IEEE Transactions on Parallel and Distributed Systems 20(8): 1158-1172, 2009
  79. Scalable transparent checkpoint-restart of global address space applications on virtual machines over infiniband
    O. Villa, S. Krishnamoorthy, J. Nieplocha, D.M. Brown Jr.
    Conference on Computing Frontiers 2009, April 2009.

    2008

  81. Global trees: a framework for linked data structures on distributed memory parallel systems
    B. Larkins, J. Dinan, S. Krishnamoorthy, S. Parthasarathy, A. Rountev, P. Sadayappan
    Supercomputing (SC) 2008, November 2008.
  82. Solving large, irregular graph problems using adaptive work-stealing
    G. Cong, S. Kodali, S. Krishnamoorthy, D. Lea, V. Saraswat, T. Wen
    Proceedings of the International Conference on Parallel Processing (ICPP'08), September 2008.
  83. Scioto: a framework for global-view task parallelism
    J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, and P. Sadayappan
    Proceedings of the International Conference on Parallel Processing (ICPP'08), September 2008.
  84. A compiler framework for optimization of affine loop nests for GPGPUs
    M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
    Proceedings of the International Conference on Supercomputing (ICS'08), June 2008, Island of Kos, Greece.
  85. Integrated Data and Task Management for Scientific Applications
    J. Nieplocha, S. Krishnamoorthy, M. Valiev, M. Krishnan, B. Palmer, and P. Sadayappan
    Proceedings of the 8th International Conference on Computational Science (ICCS 2008), June 2008, Krakow, Poland.
  86. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
    Uday Bondhugula, Muthu Manikandan Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
    Proceedings of the International Conference on Compiler Construction (ETAPS CC'08), April 2008, Budapest, Hungary.
  87. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
    M. Baskaran, Uday Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan.
    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'08), February 2008

    2007

  89. Efficient search-space pruning for integrated fusion and tiling transformations
    X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, and P. Sadayappan.
    Concurrency and Computation: Practice and Experience, 2007
  90. Non-collective parallel I/O for global address space programming models
    S. Krishnamoorthy, J. P. Canovas, V. Tipparaju, J. Nieplocha, and P. Sadayappan.
    Proceedings of the International Conference on Cluster Computing (CLUSTER 2007). September 2007
  91. Effective automatic parallelization of stencil computations
    S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan.
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2007). June 2007

    2006

  93. Hypergraph partitioning for automatic memory hierarchy management
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan.
    Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006). November 2006
  94. Design and implementation of a one-sided communication interface for the IBM eserver blue gene supercomputer
    Michael Blocksome, Charles Archer, Todd Inglett, Pat McCarthy, Mike Mundy, Joe Ratterman, Albert Sidelnik, Brian Smith, Gheorghe Almasi, Jose Castanos, Derek Lieber, Jose Moreira, Sriram Krishnamoorthy, and Vinod Tipparaju
    Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006). November 2006
  95. Locality conscious processor allocation and scheduling for mixed-parallel applications
    N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz.
    Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER 2006). September 2006
  96. Combining analytical and empirical approaches in tuning matrix transposition
    Q. Lu, S. Krishnamoorthy, and P. Sadayappan.
    Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006)
  97. An integrated approach for processor allocation and scheduling of mixed-parallel applications
    N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz.
    The 35th International Conference on Parallel Processing (ICPP 2006)
  98. Identifying cost-effective common subexpressions to reduce operation count in tensor contraction evaluations
    A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M. Nooijen, G. Baumgartner, V. Choppella, D. E. Bernholdt, R. M. Pitzer, J. Ramanujam, A. Rountev, and P. Sadayappan.
    The 6th International Conference on Computational Science (ICCS 2006)
  99. An approach to locality-conscious load balancing and transparent memory hierarchy management with a global-address-space parallel programming model
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan.
    IPDPS Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL 2006)
  100. An extensible global address space framework with decoupled task and data abstractions
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan.
    IPDPS Workshop on Next Generation Software (NGS 2006).
  101. Layout transformation support for the disk resident arrays framework
    S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, and P. Sadayappan.
    Journal of Supercomputing. vol: 36(2) pp:153-170 May 2006
  102. Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver
    Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, P. Sadayappan, and Venkatesh Choppella
    Journal of Parallel and Distributed Computing (IPDPS Special Issue) vol:66(5) pp:659-673. May 2006
  103. Search-based performance-model driven optimization for compilation of tensor contraction expressions
    X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G. Baumgartner, J. Ramanujam, and P. Sadayappan.
    The 12th Workshop on Compilers for Parallel Computers (CPC 2006). Coruna, Spain.
  104. Task scheduling and file replication for data-intensive jobs with batch-shared i/o
    G. Khanna, N. Vydyanathan, U. Catalyurek, T. Kurc, S. Krishnamoorthy, P. Sadayappan, J. Saltz
    The 15th IEEE International Symposium on High Performance Distributed Computing (HPDC 2006)
  105. Automatic code generation for many-body electronic structure methods: the tensor contraction engine
    A. Auer, G. Baumgartner, D. E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Krishnamoorthy, S. Krishnan, C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov.
    Molecular Physics vol:104(2), pp:211-228. January 2006

    2005

  107. Data and computation abstractions for dynamic and irregular computations
    S. Krishnamoorthy, J. Nieplocha, P. Sadayappan.
    The 12th Annual International Conference on High Performance Computing (HiPC 2005)
  108. Integrated loop optimizations for data locality enhancement of tensor contraction expressions
    S. K. Sahoo, S. Krishnamoorthy, R. Panuganti, P. Sadayappan.
    Supercomputing (SC 2005)
  109. Efficient search-space pruning for integrated fusion and tiling transformations
    X. Gao, S. Krishnamoorthy, S. K. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, P. Sadayappan.
    The 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2005)
  110. Locality-aware load balancing for dynamic and irregular computations
    S. Krishnamoorthy, P. Sadayappan, J. Nieplocha, and M. Krishnan
    Workshop on Patterns in High Performance Computing. May 2005
  111. Cache miss characterization and data locality optimization for imperfectly nested loops on shared memory multiprocessors
    S. K. Sahoo, R. Panuganti, S. Krishnamoorthy, P. Sadayappan.
    19th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2005)
  112. Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models
    G. Baumgartner, A. Auer, D.E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R.J. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov.
    Proceedings of the IEEE. vol: 93(2) pp:276-292 February 2005.

    2004

  114. Layout transformation support for the disk resident arrays framework
    S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha and P. Sadayappan.
    The Los Alamos Computer Science Initiative Symposium (LACSI 2004)
  115. Efficient layout transformation support for disk-based multidimensional arrays
    S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha and P. Sadayappan.
    The 11th Annual International Conference on High Performance Computing (HiPC 2004)
  116. Efficient parallel out-of-core matrix transposition
    S. Krishnamoorthy, G. Baumgartner, Daniel Cociorva, C. Lam and P. Sadayappan.
    International Journal of High Performance Computing and Networking. vol:2(2/3/4) pp:110-119 2004
  117. Empirical performance-model driven data layout optimization
    Q. Lu, X. Gao, S. Krishnamoorthy, G. Baumgartner, J. Ramanujam and P. Sadayappan.
    The 17th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2004)
  118. Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver (Best Paper Award)
    S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan and V. Choppella.
    The 18th International Parallel & Distributed Processing Symposium (IPDPS 2004).

    2003

  120. Data locality optimization for synthesis of efficient out-of-core algorithms (Best Paper Award)
    Sandhya Krishnan, Sriram Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan, J. Ramanujam, David E. Bernholdt and V. Choppella.
    The 10th Annual International Conference on High Performance Computing (HiPC 2003). December 2003.
  121. Efficient parallel out-of-core matrix transposition
    S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam and P. Sadayappan.
    IEEE International Conference on Cluster Computing (CLUSTER 2003). December 2003

Technical Reports

  1. Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories
    M. Manikandan Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-2/08-TR05
  2. Affine transformations for communication minimal parallelization and locality optimization of arbitrarily nested loop sequences
    Uday Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan.
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-5/07-TR43
  3. A Compiler Framework for Optimization of Affine Loop Nests for General Purpose Computations on GPUs
    M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-12/07-TR78
  4. An integrated approach for processor allocation and scheduling of mixed-parallel applications
    N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and Joel Saltz.
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-2/06-TR20
  5. On efficient out-of-core matrix transposition
    S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam and P. Sadayappan.
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-9/03-TR52

Invited Papers

  1. Towards effective automatic parallelization for multicore systems
    Uday Bondhugula, Muthu Baskaran, Albert Hartono, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev and P. Sadayappan
    Proceedings of the IPDPS Workshop on Next Generation Software (NSF-NGS 2008). April 2008
  2. A global address space framework for locality aware scheduling of block-sparse computations
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan.
    Proceedings of the IPDPS Workshop on Next Generation Software (NSF-NGS 2007). April 2007

Posters

  1. Scalable Fault Tolerance in PGAS Programming Models
    Nawab Ali, Sriram Krishnamoorthy, Niranjan Govind, Bruce Palmer, and Oreste Villa
    Supercomputing 2010. November 2010
  2. Parallel global address space framework with multiple inter-operable abstractions
    Sriram Krishnamoorthy, Brian Larkins, Atanas Rountev, P. Sadayappan, Jarek Nieplocha, and Robert J. Harrison
    The Second Conference on Partitioned Global Address Space Programming Models (PGAS 2006). October 2006
  3. Web service pipelining
    New Melchizedec, S. Krishnamoorthy, Vimal Kumar Vivekananthamoorthy, and Arul Siromoney.
    The 8th Annual International Conference on High Performance Computing. (HiPC 2001). December 2001



Under construction. Last modified: 5-May-2008