Supercomputing in India
A success story
R. RAMACHANDRAN
C-DAC’s efforts in the strategic and economically important area of HPC have put India on the supercomputing map of the world.
PARAM Yuva is the latest in the series of C-DAC’s supercomputers.
IN India, the name C-DAC (Centre for Development of Advanced Computing) has become synonymous with supercomputing, or High Performance Computing (HPC), though the phrase “Advanced Computing” could, in principle, denote any computing environment that makes use of advanced tools, both hardware and software, not necessarily for high-speed number crunching. The reason for that lies in the organisation’s history.
The country, faced with a technology-denial regime that denied its scientific community access to supercomputers, in particular Cray systems, set up C-DAC in March 1988 with the clear mandate to develop an HPC system to meet high-speed computational needs in solving scientific and other developmental problems where fast number crunching is a major component. Following a specific recommendation of the Science Advisory Council to the Prime Minister (SAC-PM) to that effect, C-DAC was established as a scientific society of the then Department of Electronics (now the Department of Information Technology (DIT) under the Ministry of Communications and Information Technology).
Essentially an R&D organisation, C-DAC achieved its primary objective of developing a supercomputer with a capability of one giga, or one billion, floating point operations a second (1 Gflops) in the early 1990s. Christened PARAM 8000, it set the platform for a whole series of parallel HPC systems, called the PARAM series, over the years, with PARAM 20000, or PARAM Padma, breaking the teraflop (thousand billion flops) barrier in 2002 with a peak speed of 1 Tflops.
The latest in the series is called PARAM Yuva, which was developed last year and was ranked 68th in the TOP500 list released in November 2008 at the Supercomputing Conference in Austin, Texas, United States. “The system,” according to C-DAC scientists, “is an intermediate milestone of C-DAC’s HPC road map towards achieving petaflops (million billion flops) computing speed by 2012” (see chart).
As part of this, C-DAC has also set up a National PARAM Supercomputing Facility (NPSF) in Pune, where C-DAC is headquartered, to allow researchers access to HPC systems to address their computer-intensive problems. C-DAC’s efforts in this strategically and economically important area have thus put India on the supercomputing map of the world, alongside a select group of developed nations.
As of 2008, 52 PARAM systems have been deployed in the country and abroad, eight of them at locations in Russia, Singapore, Germany and Canada.
The PARAM series of cluster computing systems is based on what is called OpenFrame Architecture. PARAM Yuva, in particular, uses a high-speed 10 gigabits per second (Gbps) system area network called PARAM Net-3, developed indigenously by C-DAC over the last three years, as the primary interconnect. This HPC cluster system is built with nodes designed around the state-of-the-art x86 architecture based on quad-core processors. In its complete configuration, PARAM Yuva has 4,608 cores of Intel Xeon 73XX processors, code-named Tigerton, with a clock speed of 2.93 gigahertz (GHz). The system has a sustained performance of 37.8 Tflops and a peak speed of 54 Tflops.
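The peak and sustained figures quoted for PARAM Yuva can be cross-checked with simple arithmetic. The sketch below assumes that each core completes four double-precision floating point operations per clock cycle; that figure is not given in the article and is used here only as an illustrative assumption. The function names are likewise hypothetical.

```python
# Theoretical peak = cores x clock rate x floating point operations per cycle.
CORES = 4608             # from the article
CLOCK_GHZ = 2.93         # from the article
SUSTAINED_TFLOPS = 37.8  # from the article
FLOPS_PER_CYCLE = 4      # ASSUMPTION: operations per core per cycle, not stated in the article


def peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Return the theoretical peak speed in Tflops."""
    gflops = cores * clock_ghz * flops_per_cycle
    return gflops / 1000.0


def efficiency(sustained_tflops, peak):
    """Return sustained performance as a fraction of the theoretical peak."""
    return sustained_tflops / peak


peak = peak_tflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
print(f"peak: {peak:.1f} Tflops")
print(f"sustained / peak: {efficiency(SUSTAINED_TFLOPS, peak):.0%}")
```

Under that assumption the computed peak comes to about 54 Tflops, matching the quoted figure, and the sustained performance works out to roughly 70 per cent of peak.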
A novel feature of PARAM Yuva is its reconfigurable computing (RC) capability, an innovative way of speeding up HPC applications by dynamically configuring hardware to suit the suite of algorithms or applications being run; the capability appears for the first time in PARAM Yuva. The RC hardware essentially uses acceleration cards as external add-ons to boost speed significantly while saving on power and space.
C-DAC is one of the first organisations to bring the concept of reconfigurable hardware resources to the country. C-DAC has not only implemented the latest RC hardware, it has also developed system software and hardware libraries to achieve appropriate accelerations in performance.
As C-DAC has been scaling different milestones in HPC hardware, it has also been developing HPC application software, providing end-to-end solutions in an HPC environment to different end-users in mission mode. As recently as early January, C-DAC set up a supercomputing facility around a scaled-down version of PARAM Yuva at North-Eastern Hill University (NEHU) in Shillong, complete with all allied C-DAC technology components and application software.
Diverse ventures
What is not so well known to the general public is that C-DAC’s activities are not restricted to the domain of HPC alone. In fact, having fulfilled its primary goal, C-DAC broadened its spectrum of activities to give true meaning to the phrase Advanced Computing embedded in its name.
Over the years, the centre has diversified into a host of IT-enabled technologies, products and services. It has, in fact, been a pioneer in some areas which, being driven by a national developmental perspective, were unlikely to have been taken up by any private player in the IT market.
While its core or cutting-edge technology areas include, besides HPC, grid computing, language technologies and multilingual computing, software technologies including free and open source software (FOSS) solutions, very-large-scale integration (VLSI) and embedded and real-time systems (RTS), these two decades of innovation have also seen C-DAC venture into such areas as e-governance, cyber security and cyber forensics, professional electronics, Area Traffic Control System (ATCS) and health informatics. Grid computing and bundled open source operating systems for use in the Indian context are the most recent of C-DAC’s successful initiatives. In addition, education and training form an important component of C-DAC’s activities.
C-DAC’s foray into these diverse fields has resulted in several enabling technologies and related products and services, which have been transferred and deployed in key sectors of the economy such as science and engineering, power, defence, health care, agriculture, industrial control, broadcasting, entertainment, education and democratic governance.
Today, its vision is “to emerge as the premier R&D institution for the design, development and deployment of world-class IT solutions for economic and human advancement”. C-DAC is already in discussion with the government to set up a separate commercialisation and marketing arm for its diverse products, solutions and services even as it is exploring the possibility of spawning a company.
As C-DAC evolved and developed in-house skills and expertise in diverse fields, several institutions under the Ministry across the country, which were carrying out niche tasks, have been merged with C-DAC to create a mega R&D institution with synergies across several IT-related disciplines. As C-DAC Director General S. Ramakrishnan says, this enlarged portfolio of institutions has resulted in a much broader skill base and a large geographic footprint for the organisation. C-DAC now has 11 R&D centres, which are located in Pune, Bangalore (Bangalore Knowledge Park, Bangalore Electronics City), Chennai, Hyderabad, Kolkata, Mohali, Mumbai, New Delhi, Noida and Thiruvananthapuram, and the skill base consists of over 2,500 members.
Important among HPC-related developments in C-DAC is the national computing grid initiative called Garuda, which is a collaboration of science researchers and experimenters on a nationwide grid of computational nodes, mass storage and scientific instruments that aims to provide technological advances required to enable data- and computation-intensive research. The Garuda grid has already completed its proof of concept (PoC) phase, and in the foundation phase more applications and new technology and architecture such as service-oriented architecture (SOA) in grid computing will be tested and validated.
Garuda connects 45 institutions across 17 cities, with a peak capacity of 2.43 Gbps. This network is seen as the precursor to the next generation gigabit speed nation-wide area network with HPC resources and scientific instruments for seamless collaborative research and experimentation.
One of the major challenges in Garuda has been the deployment of appropriate tools and middleware to enable applications to run seamlessly across the grid. Towards this and related requirements, C-DAC has initiated research in Semantic Grid Services, Mobile Agents, Integrated Development Environments, Network Simulation and Grid File Systems. These initiatives are being carried out both internally and in collaboration with institutions such as the Indian Institute of Technology Madras; the Space Applications Centre (SAC), Ahmedabad; and the Indian Institute of Science (IISc), Bangalore. C-DAC is also collaborating in the European Union-India grid project, which will allow researchers in the E.U. and India to carry out research over the European Enabling Grids for E-SciencE (EGEE) and Garuda.
Garuda initiative
The Garuda initiative is a demonstration of future directions in adopting front-line grid technologies in the country. For instance, overlaying Garuda over the multi-gigabit National Knowledge Network (NKN) will have a tremendous impact on collaborative research between educational institutions, particularly universities, across the country. A significant development was the recent launch of the Indian Grid Certification Authority (IGCA), a certification authority for computational grids under the aegis of C-DAC, which has now been accredited by the Asia-Pacific Grid Policy Management Authority (APGrid PMA). This enables Indian researchers to access worldwide grids, not just the bilaterally enabled EGEE. The IGCA will address security issues of Indian grids and the interoperability between Indian and international grids.
C-DAC’s language technology mission was initiated to create a framework to support various living languages with diverse scripts on standard computers.
C-DAC’s innovation in language technologies began with its widely acclaimed Graphics and Intelligence based Script Technology (GIST), whose inventor initiated its development at IIT Kanpur and later joined the ranks of C-DAC in the early 1990s. In fact, this led to the creation of a GIST group within C-DAC, which developed several applications using GIST. The technology was extended to include multimedia and multilingual computing solutions, covering applications such as publishing and printing, word processing, office automation suites with language interfaces for popular third party software on various operating platforms, electronic mail, natural language processing and artificial intelligence-based machine-aided language learning and translation.
C-DAC’s continuing efforts in this field have resulted in appropriate tools for today’s context. The most significant among them is perhaps a cross-language search engine called G-CLASS (GIST cross language search plug-ins suite).
Search engines are mainly statistical in nature and suffer from certain lacunae because of which multiple querying is often needed. In the case of Indian languages, the problem is even more acute. G-CLASS addresses these problems successfully. Apart from providing search engine developers with behind-the-scenes solutions such as conversion of legacy data to Unicode and also identifying languages that use the same script, such as Hindi and Marathi, G-CLASS enhances search capabilities by providing a suite of linguistic tools, such as multilingual searches and synonymic search.
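The point about languages that share a script can be illustrated with a small sketch. It is not G-CLASS’s method; it only shows, using Python’s standard `unicodedata` module, that Unicode identifies the script of a word but not its language, which is why additional linguistic tools are needed to tell Hindi from Marathi. The function name `script_of` and the sample words (forms of the word for “water”) are illustrative assumptions.

```python
import unicodedata
from collections import Counter


def script_of(text):
    """Guess the dominant script of text from its Unicode character names.

    The first word of a letter's Unicode name (for example, DEVANAGARI)
    is used as a rough script label. Non-letters are ignored.
    """
    labels = Counter(
        unicodedata.name(ch).split()[0]
        for ch in text
        if ch.isalpha()
    )
    if not labels:
        return None
    return labels.most_common(1)[0][0]


# Illustrative sample words, not taken from the article.
samples = {
    "Hindi": "पानी",
    "Marathi": "पाणी",
    "Gujarati": "પાણી",
}

for language, word in samples.items():
    print(language, script_of(word))
```

The Hindi and Marathi words both come out as Devanagari, so script detection alone cannot separate them, whereas the Gujarati word is identified by its script.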
At present, the searches that G-CLASS enables are restricted to Hindi, Marathi, Gujarati and Oriya. Bengali, Malayalam and Punjabi are under development. Tamil, Konkani and Kannada and the remaining official languages are expected to follow. C-DAC has also developed Indian language tools and solutions, which are free and can be downloaded and are available on CDs, for almost all platforms including desktops, mobile, television, websites and e-governance applications. These have been launched for 11 languages and already about 6 lakh CDs have been distributed and nearly 30 lakh users have downloaded them.
The platform set by earlier language technology activities has led to the Applied Artificial Intelligence (AAI) group at C-DAC developing some fundamental and innovative applications in the field of natural language processing, including machine translation, information extraction/retrieval, text summarisation, automatic speech recognition, text-to-speech synthesis, intelligent language teaching and natural language-based document management with Decision Support Systems.
The MANTRA Rajya Sabha tool is one of its major achievements. Using it, immediate translations of the English text in the domain of the Rajya Sabha’s list of business, papers laid on the Table, bulletins and debate synopses can be obtained. The translation strategy adopted in MANTRA Rajya Sabha is neither “word to word” nor “rule to rule” but “lexical tree to lexical tree”.
Following MANTRA’s successful demonstration, the Department of Official Languages has sponsored a project named “Computer Assisted Translation System for Administrative Purposes” for the specific domain of gazette notifications. This package has been named MANTRA-Rajbhasha.
Recent initiatives in the area of open source software, another of C-DAC’s areas of proven competence, have resulted in the successful development of Bharat Operating Systems Solutions (BOSS). To develop open source software in the country and to address issues in the Indian context, the DIT has launched the National Resource Centre for Free and Open Source Software (NRCFOSS). One of its main objectives was to develop an Indian version of the GNU/Linux operating system. BOSS Linux is simpler to install than many of the normal Linux packages.
The first edition of BOSS took about five months to be packaged and released. C-DAC proposes to release new versions once or twice a year. A BOSS development repository is being prepared to make it easier for developers to build other specialised Linux distributions based on BOSS Linux.
C-DAC is playing a front-line role in developing appropriate tools and software for use by medical professionals and for day-to-day home remedies. Its R&D efforts in these areas have resulted in many deployable systems and solutions for telemedicine (Mercury and Sanjeevani) and tele-education, hospital information systems, tele-oncology, a decision support system for Ayurveda and software development kits for international health care standards.
Among C-DAC’s health care initiatives, ONCONET, the first implementation of a tele-oncology system in India, in Kerala, deserves particular mention.
It is a comprehensive telemedicine solution, which has established a knowledge-enabled oncology network connecting the speciality hospital at the Regional Cancer Centre, Thiruvananthapuram, with remote hospitals at various other places (Kannur, Kollam, Kochi, Palakkad and Kozhenchery) in Kerala. A web-enabled hospital information system called TEJHAS was also developed and integrated with ONCONET.
C-DAC also provides an appropriate environment for research and academic pursuits. For instance, C-DAC has in-house research expertise in the complex discipline of climate and weather modelling and forecasting. It has developed a Real Time Weather System (RTWS) called Anuman, a fully automated, flexible, portable, web-based software system for weather simulation. Anuman’s SOA is capable of predicting high-impact events.