Papers
Referring to GA in a Publication
Please cite the following when referencing GA in a publication:
- Jarek Nieplocha, Bruce Palmer, Vinod Tipparaju, Manojkumar Krishnan, Harold Trease, and Edo Apra. 2006. "Advances, Applications and Performance of the Global Arrays Shared Memory Programming Toolkit." International Journal of High Performance Computing Applications, Vol. 20, No. 2, 203-231p. Paper (PDF, 670KB). Please refer to journal's website for the abstract and the final formatted manuscript.
- Manojkumar Krishnan, Bruce Palmer, Abhinav Vishnu, Sriram Krishnamoorthy, Jeff Daily, and Daniel Chavarria. 2012. The Global Arrays User Manual. Paper (PDF, 616KB).
- Global Arrays Webpage. http://hpc.pnl.gov/globalarrays
Referring to ComEx in a Publication
- J. Daily, A. Vishnu, B. Palmer, H. van Dam, D. Kerbyson. 2014. "On the suitability of MPI as a PGAS runtime." 21st International Conference on High Performance Computing (HiPC). (pdf), (bib).
- ComEx Webpage. http://hpc.pnl.gov/comex.
Papers Related to the GA, ComEx and ARMCI (deprecated) Projects
2015
- N. Tallent, A. Vishnu, H. Van Dam, J. Daily, D. Kerbyson, and A. Hoisie. 2015. "Diagnosing the causes and severity of one-sided message contention." In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2015). ACM, New York, NY, USA, 130-139.
2014
- J. Daily, A. Vishnu, B. Palmer, H. van Dam, D. Kerbyson. 2014. "On the suitability of MPI as a PGAS runtime." 21st International Conference on High Performance Computing (HiPC).
2013
- A. Vishnu, D. Kerbyson, K. Barker, and H. van Dam. 2013. "Building Scalable Communication Subsystem on Blue Gene/Q." International Workshop on Communication Architecture on Scalable Systems (CASS), International Parallel and Distributed Processing Symposium (IPDPS).
2012
- A. Vishnu, J. Daily, B. Palmer. 2012. "Designing scalable PGAS communication subsystems on Cray Gemini interconnect." International Conference on High Performance Computing (HiPC), India.
- A. Vishnu, S. Song, A. Marquez, K. Barker, D. Kerbyson and P. Balaji. 2012. "Designing Energy Efficient Communication Runtime Systems: A View From PGAS Models." Special Issue on Green Computing and Communications, Journal of Supercomputing (JoSC).
- N. Ali, S. Krishnamoorthy, M. Halappanavar, J. Daily. 2012. "Multi-failure Tolerance for Cartesian Data Distributions." International Journal of Parallel Programming, Computing Frontiers special issue.
- J. Hammond, S. Krishnamoorthy, S. Shende, N. Romero, and A. Malony. 2012. "Performance Characterization of Global Address Space Applications: A Case Study with NWChem." Concurrency and Computation: Practice and Experience. vol:24(2), pp:135-154.
- W. Ma, S. Krishnamoorthy, O. Villa, and K. Kowalski. 2012. "Optimizing Tensor Contraction Expressions for Hybrid CPU-GPU Execution." Cluster Computing Special Issue.
- D. Chavarria, S. Krishnamoorthy, A. Vishnu. May 2012. "Global Futures: a multithreaded execution model for Global Arrays-based applications." IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
- H. Arafat, J. Dinan, S. Krishnamoorthy, T. Windus, P. Sadayappan. May 2012. "Load Balancing of Dynamical Nucleation Theory Monte Carlo Simulations through Resource Sharing Barriers." IEEE International Parallel & Distributed Processing Symposium.
- J. Dinan, P. Balaji, J. Hammond, S. Krishnamoorthy, V. Tipparaju. May 2012. "Supporting the Global Arrays PGAS Model Using MPI One-Sided Communication." IEEE International Parallel & Distributed Processing Symposium.
2011
- K. Kowalski, S. Krishnamoorthy, R. Olson, V. Tipparaju, E. Apra. November 2011. "Scalable Implementations of Accurate Excited-state Coupled Cluster Theories: Application of High-level Methods to Porphyrin-based Systems." International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
- J. Dinan, S. Krishnamoorthy, P. Balaji, J. Hammond, M. Krishnan, V. Tipparaju, A. Vishnu. September 2011. "Noncollective Communicator Creation in MPI", Special Session on Improving MPI User And Developer Interaction, EuroMPI.
- N. Ali, S. Krishnamoorthy, N. Govind, K. Kowalski, P. Sadayappan. August 2011. "Application-Specific Fault Tolerance via Data Access Characterization." Euro-Par 2011.
- J. Brabec, S. Krishnamoorthy, HJJ. van Dam, K. Kowalski, J. Pittner. August 2011. "Massively parallel implementation of the multi-reference Brillouin-Wigner CCSD method." Chemical Physics Letters vol:514(4-6), pp:347-351.
- A. Vishnu, R. Olson, and M. Bruggencate. August 2011. "Evaluating the Potential of Cray Gemini Interconnect for PGAS Models." International Symposium on High-Performance Interconnects (HotI), Santa Clara.
- Jeff Daily and Robert R. Lewis. July 2011. "Using the Global Arrays Toolkit to Reimplement NumPy for Distributed Computation." Proceedings of the 10th Annual Python in Science Conference (SciPy 2011). Paper (PDF, 283KB).
- M. Hermanns, S. Krishnamoorthy, F. Wolf. May 2011. "A scalable replay-based infrastructure for the performance analysis of one-sided communication." First International Workshop on High-performance Infrastructure for Scalable Tools (WHIST).
- E. Van Hensbergen, R. Minnich, C. Janssen, S. Krishnamoorthy, A. Marquez, M. Gokhale, P. Sadayappan, J. McKie, J. Appavoo. May 2011. "Fault Oblivious eXascale Whitepaper." International Workshop on Runtime and Operating Systems for Supercomputers (ROSS).
- W. Ma, S. Krishnamoorthy, O. Villa, K. Kowalski. May 2011. "GPU-Based Implementations of the Noniterative Regularized-CCSD(T) Corrections: Applications to Strongly Correlated Systems." Journal of Chemical Theory and Computation, vol:7(5) pp:1316-1327.
- N. Ali, S. Krishnamoorthy, M. Halappanavar, and J. Daily. May 2011. "Tolerating Correlated Failures for Generalized Cartesian Distributions via Bipartite Matching." ACM International Conference on Computing Frontiers (CF'11).
- A. Vishnu, M. Krishnan, and P. Balaji. May 2011. "Dynamic Time-Variant Connection Management for PGAS Models on InfiniBand." Workshop on Communication Architecture for Scalable Systems (CASS), International Parallel and Distributed Processing Symposium (IPDPS), Alaska, 2011.
- N. Ali, S. Krishnamoorthy, N. Govind, B. Palmer. February 2011. "A Redundant Communication Approach to Scalable Fault Tolerance in PGAS Programming Models." 19th Euromicro International Conference on Parallel, Distributed and Network-Based Computing.
- V. Saraswat, P. Kambadur, S. Kodali, D. Grove, S. Krishnamoorthy. February 2011. "Lifeline-based Global Load Balancing." 16th ACM SIGPLAN Annual Symposium on Principles and Practices of Parallel Programming.
- H. van Dam, A. Vishnu, and W. de Jong. January 2011. "Designing A Scalable Fault Tolerance Model for Computational Chemistry: A Case Study with Coupled Cluster Perturbative Triples." Journal of Chemical Theory and Computation.
2010
- A. Vishnu, S. Song, A. Marquez, K. Barker, D. Kerbyson and P. Balaji. December 2010. "Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models." International Conference on Green Computing and Communications (GreenCom), China.
- A. Vishnu, H. van Dam, W. de Jong, P. Balaji, S. Song. December 2010. "Fault Tolerant Communication Runtime Support for Data Centric Programming Models." International Conference on High Performance Computing (HiPC), India.
- J. Siegel, O. Villa, S. Krishnamoorthy, A. Tumeo, and X. Li. September 2010. "Efficient Sparse Matrix-Matrix Multiplication on Heterogeneous High Performance Systems." Workshop on Application/Architecture Co-design for Extreme-scale Computing (AACEC).
- W. Ma, S. Krishnamoorthy, O. Villa, and K. Kowalski. September 2010. "Acceleration of Streamed Tensor Contraction Expressions on GPGPU-based Clusters." IEEE International Conference on Cluster Computing (CLUSTER).
- M. Krishnan, B. Lewis, A. Vishnu. September 2010. "Scaling Linear Algebra Kernels Using Remote Memory Access." Third International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), held in conjunction with International Conference on Parallel Processing (ICPP), San Diego.
- A. Vishnu and M. Krishnan. May 2010. "Efficient On-demand Connection Management Mechanisms with PGAS Models on InfiniBand." International Symposium on Cluster Computing and Grid Computing (CCGrid), Melbourne, Australia.
- J. Dinan, A. Singri, P. Sadayappan, and S. Krishnamoorthy. May 2010. "Selective Recovery From Failures In A Task Parallel Programming Model." Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing - Resilience Workshop.
- K. Kowalski, S. Krishnamoorthy, O. Villa, J. Hammond, and N. Govind. April 2010. "Active-space completely-renormalized equation-of-motion coupled-cluster formalism: excited-state studies of green fluorescent protein, free-base porphyrin, and oligoporphyrin dimer." The Journal of Chemical Physics 2010 132(15)-154103.
2009
- J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, P. Sadayappan. November 2009. "Scalable Work Stealing." Supercomputing (SC) 2009.
- A. Vishnu, M. Krishnan and D. K. Panda. August 2009. "An Efficient Hardware-Software Approach to Network Fault Tolerance with InfiniBand." International Conference on Cluster Computing (Cluster), New Orleans.
- O. Villa, S. Krishnamoorthy, J. Nieplocha, D.M. Brown Jr. April 2009. "Scalable transparent checkpoint-restart of global address space applications on virtual machines over infiniband." Conference on Computing Frontiers 2009.
2008
- B. Larkins, J. Dinan, S. Krishnamoorthy, S. Parthasarathy, A. Rountev, P. Sadayappan. November 2008. "Global trees: a framework for linked data structures on distributed memory parallel systems." Supercomputing (SC) 2008.
- J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, and P. Sadayappan. September 2008. "Scioto: a framework for global-view task parallelism." Proceedings of the International Conference on Parallel Processing (ICPP'08).
- J. Nieplocha, S. Krishnamoorthy, M. Valiev, M. Krishnan, B. Palmer, and P. Sadayappan. June 2008. "Integrated Data and Task Management for Scientific Applications." Proceedings of the 8th International Conference on Computational Science (ICCS 2008), Krakow, Poland.
2007
- S. Krishnamoorthy, J. P. Canovas, V. Tipparaju, J. Nieplocha, and P. Sadayappan. September 2007. "Non-collective parallel I/O for global address space programming models." Proceedings of the International Conference on Cluster Computing (CLUSTER 2007).
2006
- S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan. November 2006. "Hypergraph partitioning for automatic memory hierarchy management." Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006).
- Michael Blocksome, Charles Archer, Todd Inglett, Pat McCarthy, Mike Mundy, Joe Ratterman, Albert Sidelnik, Brian Smith, Gheorghe Almasi, Jose Castanos, Derek Lieber, Jose Moreira, Sriram Krishnamoorthy, and Vinod Tipparaju. November 2006. "Design and implementation of a one-sided communication interface for the IBM eserver blue gene supercomputer." Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006).
- J. Nieplocha, V. Tipparaju, M. Krishnan, and D. Panda. May 2006. "High Performance Remote Memory Access Communications: The ARMCI Approach." International Journal of High Performance Computing Applications, Vol 20(2), 233-253p.
- S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, and P. Sadayappan. May 2006. "Layout transformation support for the disk resident arrays framework." Journal of Supercomputing. vol: 36(2) pp:153-170.
- Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, P. Sadayappan, and Venkatesh Chopella. May 2006. "Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver." Journal of Parallel and Distributed Computing (IPDPS Special Issue) vol:66(5) pp:659-673.
- S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan. April 2006. "An approach to locality-conscious load balancing and transparent memory hierarchy management with a global-address-space parallel programming model." IPDPS Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL 2006).
- S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan. April 2006. "An extensible global address space framework with decoupled task and data abstractions." IPDPS Workshop on Next Generation Software (NGS 2006).
- Jarek Nieplocha, Bruce Palmer, Vinod Tipparaju, Manojkumar Krishnan, Harold Trease and Edo Apra. 2006. "Advances, Applications and Performance of the Global Arrays Shared Memory Programming Toolkit." International Journal of High Performance Computing Applications, Vol. 20, No. 2, 203-231p. Paper (PDF, 670KB). Please refer to journal's website for the abstract and the final formatted manuscript.
- Sriram Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, Jarek Nieplocha, and P. Sadayappan. 2006. "Layout transformation support for the Disk Resident Arrays framework." The Journal of Supercomputing. Vol 36, No. 2, pp. 153-170.
- Manojkumar Krishnan and Jarek Nieplocha. 2006. "Memory efficient parallel matrix multiplication operation for irregular problems." In Proceedings of the 3rd conference on Computing frontiers, Ischia, Italy.
2005
- Tipparaju, V. and Nieplocha, J. November 2005. "Optimizing All-to-All Collective Communication by Exploiting Concurrency in Modern Networks." In Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (November 12 - 18, 2005). Conference on High Performance Networking and Computing.
- Nieplocha, J., Tipparaju, V., and Krishnan, M. July 2005. "Optimizing Strided Remote Memory Access Operations on the Quadrics QsNetII Network Interconnect." In Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region (November 30 - December 03, 2005). HPCASIA.
- Nieplocha, J., Tipparaju, V., and Apra, E. April 2005. "An evaluation of two implementation strategies for optimizing one-sided atomic reduction." In Proc. of the 19th IEEE International Parallel and Distributed Processing Symposium.
- Manojkumar Krishnan, Yuri Alexeev, Theresa L. Windus, and Jarek Nieplocha. 2005. "Multilevel Parallelism in Computational Chemistry using Common Component Architecture." In Proc. of Supercomputing 2005, Seattle, WA.
- Jarek Nieplocha, Manojkumar Krishnan, Bruce Palmer, Vinod Tipparaju, and Yeliang Zhang. 2005. "Exploiting Processor Groups to Extend Scalability of the GA Shared Memory Programming Model." In Proc. of ACM Computing Frontiers, Italy.
- Jarek Nieplocha, Doug Baxter, Vinod Tipparaju, Craig Edward Rasmussen, Robert W. Numrich. 2005. "Symmetric Data Objects and Remote Memory Access Communication for Fortran-95 Applications". In Euro-Par 2005, 720-729. Paper (PDF, 79KB).
- Manoj Krishnan and Jarek Nieplocha. 2005. "Optimizing Performance on Linux Clusters Using Advanced Communication Protocols: Achieving Over 10 Teraflops on a 8.6 Teraflops Linpack-Rated Linux Cluster." In Proceedings of the 6th International Conference on Linux clusters: The HPC Revolution. Paper (PDF, 194KB).
2004
- V. Tipparaju, G. Santhanaraman, J. Nieplocha, D.K. Panda. April 2004. "Host assisted zero-copy remote memory access communication on Infiniband." In Proc. of IPDPS'04.
- M. Krishnan, J. Nieplocha. April 2004. "SRUMMA: a matrix multiplication algorithm suitable for clusters and scalable shared memory systems." In Proc. of IPDPS'04.
- Bernholdt D., J. Nieplocha, P. Sadayappan. 2004. "Raising the level of programming abstraction in scalable programming models." Proc. PPHEC/HCA-10, Madrid, Spain.
2003
- J. Nieplocha, V. Tipparaju, M. Krishnan, G. Santhanaraman, D.K. Panda. December 2003. "Optimizing mechanisms for latency tolerance in remote memory access communication on clusters." In Proc. IEEE Intern. Conf. Cluster Computing. CLUSTER'2003.
- Vinod Tipparaju, Manojkumar Krishnan, Jarek Nieplocha, Gopalakrishnan Santhanaraman, and Dhabaleswar Panda. December 2003. "Exploiting Non-blocking Remote Memory Access Communication in Scientific Benchmarks." In Proc. HiPC'03. Paper (PDF, 263KB).
- Palmer B., J. Nieplocha, E. Apra. 2003. "Shared memory mirroring for reducing communication overhead on commodity networks." In Proc. IEEE CLUSTER'03, Hong Kong.
- J. Nieplocha, J. Ju, Vinod Tipparaju and E. Aprà. April 2003. "One-Sided Communication on Clusters with Myrinet." In Journal of Cluster Computing. Volume 6 Issue 2.
- Vinod Tipparaju, Jarek Nieplocha, Dhabaleswar Panda. April 2003. "Fast collective operations using shared and remote memory access protocols on clusters." In Proceedings of the 17th International Parallel and Distributed Processing Symposium.
- Darius Buntinas, Amina Saify, Dhabaleswar K. Panda, and Jarek Nieplocha. April 2003. "Optimizing synchronization operations for remote memory communication systems." In Proceedings of the 17th International Parallel and Distributed Processing Symposium.
2002
- J. Nieplocha, V. Tipparaju, A. Saify, and D.K. Panda. 2002. "Protocols and Strategies for Optimizing Performance of Remote Memory Operations on Clusters". In Proceedings of the 16th International Parallel and Distributed Processing Symposium.
- Nieplocha J., R. Harrison, M. Kumar, B. Palmer, V. Tipparaju and H. Trease. 2002. "Combining Distributed and Shared Memory Models: Approach and Evolution of the Global Arrays Toolkit." Proc. POHLL'2002 Workshop of ICS-2002, NYC.
- Palmer B., and J. Nieplocha. 2002. "Efficient algorithms for ghost cell updates on two classes of MPP architectures." Proc. PDCS-2002.
2001
- J. Nieplocha, J. Ju, and E. Apra. 2001. "One-sided Communication on the Myrinet-based SMP Clusters using the GM Message-Passing Library." In Proceedings of the 15th International Parallel & Distributed Processing Symposium.
2000
- J. Nieplocha, J. Ju, and T.P. Straatsma. 2000. "A Multiprotocol Communication Support for the Global Address Space Programming Model on the IBM SP." In Proceedings from the 6th International Euro-Par Conference on Parallel Processing.
1999
- Jarek Nieplocha, Holger Dachsel, and Ian Foster. 1999. "Implementing noncollective parallel I/O in cluster environments using Active Message communication." In Journal of Cluster Computing, Volume 2 Issue 4.
1998
- Shah G., J. Nieplocha, J. Mirza, C. Kim, R. Harrison, R. Govindaraju, K. Gildea, P. DiNicola, and C. Bender. 1998. "Performance and Experience with LAPI, a New High-Performance Communication Library for the IBM RS/6000 SP." In Proc. Int. Parallel Proc. Symp. IPPS'98, pages 260-266.
- Holger Dachsel, Jarek Nieplocha, Robert Harrison. 1998. "An out-of-core implementation of the COLUMBUS massively-parallel multireference configuration interaction program." Proc. Supercomputing'98, p. 41.
- Nieplocha, J., Foster, I., and Dachsel, H. 1998. "Distant I/O: one-sided access to secondary storage on remote processors." High Performance Distributed Computing Conference (HPDC-7), pages 148-154. Orlando, FL.
1996
- Nieplocha J. and RJ Harrison. 1996. "Shared memory NUMA programming on I-WAY." Paper (Adobe PostScript, 181KB). Proc. IEEE Symp. HPDC-5, pages 432-441, (HPDC-5 Best Paper Award).
- Jarek Nieplocha and Ian Foster. 1996. "Disk Resident Arrays: An array-oriented I/O library for out-of-core computations." Proc. IEEE Conf. Frontiers of Massively Parallel Computing Frontiers'96, pages 196-204.
- Nieplocha J., RJ Harrison, and RJ Littlefield. 1996. "Global Arrays: A nonuniform memory access programming model for high-performance computers." The Journal of Supercomputing, 10:197-220.
- Nieplocha J., R.J. Harrison, and I. Foster. 1996. "Explicit Management of Memory Hierarchy." In Advances in High Performance Computing. Editors: L. Grandinetti, J. Kowalik, M. Vajtersic. Kluwer Academic: NATO ASI 3/30: 185-198. (see NATO ASI Series Catalog).
1995
- Nieplocha J., RJ Littlefield, and M. Rosing. 1995. "Beyond message-passing: A case for one-sided communication in MPI." Proc. 1st MPI Developers Conference.
- Nieplocha J., RJ Harrison, and RJ Littlefield. 1995. "The Global Array programming model for high performance scientific computing." SIAM-News, September.
1994
- Nieplocha J., R.J. Harrison, and R.J. Littlefield. 1994. "Global Arrays: A portable shared memory model for distributed memory computers." Proc. Supercomputing'94, pages 340-349.