GridWorld is being held in Washington, DC, right in our neighborhood. It is fascinating for us to attend the tutorials and hands-on sessions and learn something new in the high-performance computing field.
The very first tutorial, The Grid “Ecosystem”, was Grid computing 101. The speaker, Lee Liming, introduced the concepts of grid computing and then gave a very detailed overview of the Globus Toolkit.
Since a grid is distributed geographically, security is the No. 1 concern for grid administration and management. We spent about 15 minutes discussing the certificate authority (CA) and MyProxy (a service for delegating CA-issued credentials).
The core functionality is still handled by a traditional job management system (JMS), but with grid-aware features. In addition, the Globus Resource Allocation Manager (GRAM) acts as the front end to the JMS and handles authentication.
MPICH-G2 is ready. I am not quite sure about other GAS/PGAS libraries or languages.
On a grid, traditional remote copy tools such as normal FTP cannot take full advantage of the high-speed connections. GridFTP was introduced to use multiple data channels to speed up file transfers; a more aggressive approach is to exploit striped data distribution with Striped GridFTP.
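To make the multiple-data-channel idea concrete for myself, here is a minimal Python sketch. It is not the GridFTP protocol or any Globus API; `split_ranges` and `parallel_copy` are names I made up, and the "network" is just an in-memory byte string. It only shows the general idea as I understood it: cut the file into byte ranges, let each channel move its own range, and reassemble by offset.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, channels):
    """Split [0, size) into at most `channels` contiguous (offset, length) ranges."""
    base, extra = divmod(size, channels)
    ranges, offset = [], 0
    for i in range(channels):
        # The first `extra` channels carry one extra byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source, channels=4):
    """Copy `source` using one worker thread per byte range (hypothetical channels)."""
    dest = bytearray(len(source))

    def send(byte_range):
        offset, length = byte_range
        # Each "channel" writes only its own slice, so arrival order does not matter.
        dest[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=channels) as pool:
        list(pool.map(send, split_ranges(len(source), channels)))
    return bytes(dest)


if __name__ == "__main__":
    payload = b"grid" * 1000
    print(parallel_copy(payload, channels=4) == payload)
```

In this toy the threads gain nothing, of course. The point is only that each range carries its own offset, so the pieces can travel independently and still land in the right place.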
He also presented a success story, NSF’s TeraGrid, as a case study.
To my mind, a grid is characterized by non-uniform memory access (NUMA). Communication between the clusters is very expensive and unreliable. How can we distribute the workload so that it fits the topology of the grid? Consider the following scenario:
The job can be divided into modules i, j, k; there is heavy intra-module communication and little inter-module communication. The grid consists of at least three clusters, A, B, C. We would like to schedule each module onto one cluster to eliminate wasteful communication overhead. Is Globus smart enough to discover the communication pattern, or do developers need to configure the batch explicitly by hand?
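To see how much is at stake in this scenario, here is a small Python sketch that compares two placements. Everything in it is my own assumption: the task names, the traffic numbers, and the helper `cross_cluster_traffic` are invented for illustration, and nothing here is a Globus feature. It simply adds up the traffic that has to cross cluster boundaries.

```python
# Hypothetical traffic volumes between pairs of tasks (arbitrary units).
# Tasks i0/i1 belong to module i, j0/j1 to module j, k0/k1 to module k.
TRAFFIC = {
    ("i0", "i1"): 100,  # heavy intra-module communication
    ("j0", "j1"): 100,
    ("k0", "k1"): 100,
    ("i1", "j0"): 1,    # light inter-module communication
    ("j1", "k0"): 1,
}


def cross_cluster_traffic(placement, traffic=TRAFFIC):
    """Sum the traffic between tasks that are placed on different clusters."""
    return sum(
        volume
        for (a, b), volume in traffic.items()
        if placement[a] != placement[b]
    )


# One module per cluster, as the scenario hopes for.
aligned = {"i0": "A", "i1": "A", "j0": "B", "j1": "B", "k0": "C", "k1": "C"}

# A topology-unaware placement that spreads tasks round-robin.
round_robin = {"i0": "A", "i1": "B", "j0": "C", "j1": "A", "k0": "B", "k1": "C"}

if __name__ == "__main__":
    print("aligned:", cross_cluster_traffic(aligned))
    print("round-robin:", cross_cluster_traffic(round_robin))
```

With these made-up numbers the aligned placement sends 2 units across clusters and the round-robin one sends 302, which is why I care whether the scheduler can find the pattern on its own.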