[ Home ] [ Research ] [ Publications ] [ Links ]

Publications

| 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 |

    2009

  1. Performance Optimization of Tensor Contraction Expressions for Many Body Methods in Quantum Chemistry
    Q. Lu, A. Hartono, T. Henretty, S. Krishnamoorthy, H. Zhang, G. Baumgartner, D.E. Bernholdt, M. Nooijen, R.M. Pitzer, J. Ramanujam, and P. Sadayappan
    The Journal of Physical Chemistry (accepted)
  2. Scalable Work Stealing
    J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, P. Sadayappan
    Supercomputing (SC) 2009 (To Appear), November 2009
  3. Data Layout Transformation for Enhancing Locality on NUCA Chip Multiprocessors
    Q. Lu, C. Alias, U. Bondhugula, T. Henretty, S. Krishnamoorthy, J. Ramanujam, A. Rountev, P. Sadayappan, Y. Chen, H. Lin, and T. Ngai
    18th International Symposium on Parallel Architectures and Compilation Techniques (PACT-18) (To Appear), September 2009
  4. Parametric multi-level tiling of imperfectly nested loops
    A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, P. Sadayappan
    ICS 2009: 147-157, June 2009. BibTeX
  5. An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications
    N. Vydyanathan, S. Krishnamoorthy, G.M. Sabin, U.V. Catalyurek, T.M. Kurc, P. Sadayappan, J.H. Saltz
    IEEE Transactions on Parallel and Distributed Systems 20(8): 1158-1172, 2009. BibTeX
  6. Scalable transparent checkpoint-restart of global address space applications on virtual machines over infiniband
    O. Villa, S. Krishnamoorthy, J. Nieplocha, D.M. Brown Jr.
    Conference on Computing Frontiers 2009, April 2009. BibTeX
    2008

  8. Global trees: a framework for linked data structures on distributed memory parallel systems
    B. Larkins, J. Dinan, S. Krishnamoorthy, S. Parthasarathy, A. Rountev, P. Sadayappan
    Supercomputing (SC) 2008, November 2008. BibTeX
  9. Solving large, irregular graph problems using adaptive work-stealing
    G. Cong, S. Kodali, S. Krishnamoorthy, D. Lea, V. Saraswat, T. Wen
    Proceedings of the International Conference on Parallel Processing (ICPP'08), September 2008. BibTeX
  10. Scioto: a framework for global-view task parallelism
    J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, and P. Sadayappan
    Proceedings of the International Conference on Parallel Processing (ICPP'08), September 2008. BibTeX
  11. A compiler framework for optimization of affine loop nests for GPGPUs
    M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
    Proceedings of the International Conference on Supercomputing (ICS'08), June 2008, Island of Kos, Greece. BibTeX
  12. Integrated Data and Task Management for Scientific Applications
    J. Nieplocha, S. Krishnamoorthy, M. Valiev, M. Krishnan, B. Palmer, and P. Sadayappan
    Proceedings of the 8th International Conference on Computational Science (ICCS 2008), June 2008, Krakow, Poland. BibTeX
  13. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
    Uday Bondhugula, Muthu Manikandan Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
    Proceedings of the International Conference on Compiler Construction (ETAPS CC'08) April 2008, Budapest, Hungary. BibTeX
  14. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
    M. Baskaran, Uday Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan.
    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'08) February 2008 BibTeX
    2007

  16. Efficient search-space pruning for integrated fusion and tiling transformations
    X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, and P. Sadayappan.
    Concurrency and Computation: Practice and Experience, 2007 BibTeX
  17. Non-collective parallel I/O for global address space programming models
    S. Krishnamoorthy, J. P. Canovas, V. Tipparaju, J. Nieplocha, and P. Sadayappan.
    Proceedings of the International Conference on Cluster Computing (CLUSTER 2007). September 2007 BibTeX
  18. Effective automatic parallelization of stencil computations
    S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan.
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2007). June 2007 BibTeX
    2006

  20. Hypergraph partitioning for automatic memory hierarchy management
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan.
    Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006). November 2006 BibTeX
  21. Design and implementation of a one-sided communication interface for the IBM eserver blue gene supercomputer
    Michael Blocksome, Charles Archer, Todd Inglett, Pat McCarthy, Mike Mundy, Joe Ratterman, Albert Sidelnik, Brian Smith, Gheorghe Almasi, Jose Castanos, Derek Lieber, Jose Moreira, Sriram Krishnamoorthy, and Vinod Tipparaju
    Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006). November 2006 BibTeX
  22. Locality conscious processor allocation and scheduling for mixed-parallel applications
    N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz.
    Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER 2006). September 2006 BibTeX
  23. Combining analytical and empirical approaches in tuning matrix transposition
    Q. Lu, S. Krishnamoorthy, and P. Sadayappan.
    Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques. (PACT 2006) BibTeX
  24. An integrated approach for processor allocation and scheduling of mixed-parallel applications
    N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz.
    The 35th International Conference on Parallel Processing (ICPP 2006) BibTeX
  25. Identifying cost-effective common subexpressions to reduce operation count in tensor contraction evaluations
    A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M. Nooijen, G. Baumgartner, V. Choppella, D. E. Bernholdt, R. M. Pitzer, J. Ramanujam, A. Rountev, and P. Sadayappan.
    The 6th International Conference on Computational Science (ICCS 2006) BibTeX
  26. An approach to locality-conscious load balancing and transparent memory hierarchy management with a global-address-space parallel programming model
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan.
    IPDPS Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL 2006) BibTeX
  27. An extensible global address space framework with decoupled task and data abstractions
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan.
    IPDPS Workshop on Next Generation Software (NGS 2006). BibTeX
  28. Layout transformation support for the disk resident arrays framework
    S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, and P. Sadayappan.
    Journal of Supercomputing. vol: 36(2) pp:153-170 May 2006 BibTeX
  29. Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver
    Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, P. Sadayappan, and Venkatesh Choppella
    Journal of Parallel and Distributed Computing (IPDPS Special Issue) vol:66(5) pp:659-673. May 2006 BibTeX
  30. Search-based performance-model driven optimization for compilation of tensor contraction expressions
    X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G. Baumgartner, J. Ramanujam, and P. Sadayappan.
    The 12th Workshop on Compilers for Parallel Computers (CPC 2006). Coruna, Spain. BibTeX
  31. Task scheduling and file replication for data-intensive jobs with batch-shared i/o
    G. Khanna, N. Vydyanathan, U. Catalyurek, T. Kurc, S. Krishnamoorthy, P. Sadayappan, J. Saltz
    The 15th IEEE International Symposium on High Performance Distributed Computing (HPDC 2006) BibTeX
  32. Automatic code generation for many-body electronic structure methods: the tensor contraction engine
    A. Auer, G. Baumgartner, D. E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Krishnamoorthy, S. Krishnan, C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov.
    Molecular Physics vol:104(2), pp:211-228. January 2006 BibTeX
    2005

  34. Data and computation abstractions for dynamic and irregular computations
    S. Krishnamoorthy, J. Nieplocha, P. Sadayappan.
    The 12th Annual International Conference on High Performance Computing (HiPC 2005) BibTeX
  35. Integrated loop optimizations for data locality enhancement of tensor contraction expressions
    S. K. Sahoo, S. Krishnamoorthy, R. Panuganti, P. Sadayappan.
    Supercomputing (SC 2005) BibTeX
  36. Efficient search-space pruning for integrated fusion and tiling transformations
    X. Gao, S. Krishnamoorthy, S. K. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, P. Sadayappan.
    The 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2005) BibTeX
  37. Locality-aware load balancing for dynamic and irregular computations
    S. Krishnamoorthy, P. Sadayappan, J. Nieplocha, and M. Krishnan
    Workshop on Patterns in High Performance Computing. May 2005
  38. Cache miss characterization and data locality optimization for imperfectly nested loops on shared memory multiprocessors
    S. K. Sahoo, R. Panuganti, S. Krishnamoorthy, P. Sadayappan.
    19th IEEE International Parallel & Distributed Processing Symposium. (IPDPS 2005) BibTeX
  39. Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models
    G. Baumgartner, A. Auer, D.E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R.J. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov.
    Proceedings of the IEEE. vol: 93(2) pp:276-292 February 2005. BibTeX
    2004

  41. Layout transformation support for the disk resident arrays framework
    S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha and P. Sadayappan.
    The Los Alamos Computer Science Initiative Symposium. (LACSI 2004) BibTeX
  42. Efficient layout transformation support for disk-based multidimensional arrays
    S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha and P. Sadayappan.
    The 11th Annual International Conference on High Performance Computing. (HiPC 2004) BibTeX
  43. Efficient parallel out-of-core matrix transposition
    S. Krishnamoorthy, G. Baumgartner, Daniel Cociorva, C. Lam and P. Sadayappan.
    International Journal of High Performance Computing and Networking. vol:2(2/3/4) pp:110-119 2004 BibTeX
  44. Empirical performance-model driven data layout optimization
    Q. Lu, X. Gao, S. Krishnamoorthy, G. Baumgartner, J. Ramanujam and P. Sadayappan.
    The 17th International Workshop on Languages and Compilers for Parallel Computing. (LCPC 2004) BibTeX
  45. Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver (Best Paper Award)
    S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan and V. Choppella.
    The 18th International Parallel & Distributed Processing Symposium. (IPDPS 2004). BibTeX
    2003

  47. Data locality optimization for synthesis of efficient out-of-core algorithms (Best Paper Award)
    Sandhya Krishnan, Sriram Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan, J. Ramanujam, David E. Bernholdt and V. Choppella.
    The 10th Annual International Conference on High Performance Computing. (HiPC 2003). December 2003. BibTeX
  48. Efficient parallel out-of-core matrix transposition
    S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam and P. Sadayappan.
    IEEE International Conference on Cluster Computing (CLUSTER 2003). December 2003 BibTeX

Technical Reports

  1. Affine transformations for communication minimal parallelization and locality optimization of arbitrarily nested loop sequences
    Uday Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan.
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-5/07-TR43
  2. An integrated approach for processor allocation and scheduling of mixed-parallel applications
    N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and Joel Saltz.
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-2/06-TR20
  3. On efficient out-of-core matrix transposition
    S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam and P. Sadayappan.
    Department of Computer and Information Science, Ohio State University. Technical Report OSU-CISRC-9/03-TR52

Invited Papers

  1. Towards effective automatic parallelization for multicore systems
    Uday Bondhugula, Muthu Baskaran, Albert Hartono, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev and P. Sadayappan
    Proceedings of the IPDPS Workshop on Next Generation Software (NSF-NGS 2008). April 2008 BibTeX
  2. A global address space framework for locality aware scheduling of block-sparse computations
    S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan.
    Proceedings of the IPDPS Workshop on Next Generation Software (NSF-NGS 2007). April 2007 BibTeX

Posters

  1. Parallel global address space framework with multiple inter-operable abstractions
    Sriram Krishnamoorthy, Brian Larkins, Atanas Rountev, P. Sadayappan, Jarek Nieplocha, and Robert J. Harrison
    The Second Conference on Partitioned Global Address Space Programming Models (PGAS 2006). October 2006
  2. Web service pipelining
    New Melchizedec, S. Krishnamoorthy, Vimal Kumar Vivekananthamoorthy, and Arul Siromoney.
    The 8th Annual International Conference on High Performance Computing. (HiPC 2001). December 2001


Under construction. Last modified: 5-May-2008