China Dominates the World TOP500 Supercomputers

.
You can quote them and tell them it is time to deliver on their promise.

This is the biggest fun from PDF.

You can't take them seriously. It's a people that is so incompetent, yet so confident, for some mysterious reason. Sometimes you wonder whether political terms such as "freedom and democracy" actually make people insane.
 
. .
What CPUs are used in Chinese supercomputers?


Both systems use native CPU and accelerator architectures

From the article below:
It seems that China already has two exascale supercomputers at once – no one else has such systems, not even the United States.


One is:

"The successor to TaihuLight, codenamed Oceanlite, is based on a new generation of Sunway chips (ShenWei) and has a minimum of 42 million cores and a peak performance of 1.3 Eflops in HPL."

The other one:

Phytium, developed by Tianjin Phytium Technology.

"Phytium is responsible for their development, whose growth at one time was helped by the ban on the supply of Intel Xeon for Chinese supercomputers. The company has its own 7-nm Arm processors FeiTeng and Matrix Series DSPs."
 
.
COVID-19 drug screening efficiency improved 200-fold: China's new-generation Tianhe supercomputer shortlisted for the Gordon Bell Prize

2021-11-05 22:07:49 Source: Fast Technology (快科技) Author: Xianrui Editor: Xianrui

(Translated from Chinese)

On the evening of November 5, CCTV reported that the work "Large-Scale Virtual Screening of COVID-19 Drugs Based on the Free Energy Perturbation-Absolute Binding Free Energy Method", completed on the new-generation Tianhe supercomputer, was shortlisted for the 2021 Gordon Bell Special Prize for COVID-19 research. This is the first time China has been shortlisted for this special prize.

The work was nominated by Academician Chen Kaixian of the Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Professor Qian Depei of Beihang University (Beijing University of Aeronautics and Astronautics), and Professor Hans Westerhoff of the University of Amsterdam, the Netherlands.

By combining the very large-scale computing power of the new-generation Tianhe supercomputer with an internationally leading computational method for accurately evaluating drug-target binding, the team achieved rapid screening and discovery of emergency COVID-19 drugs (effective in reducing mortality and severe-case rates).

The team took important targets of the novel coronavirus, such as M-Pro and TMPRSS2, as research objects and computationally evaluated the complexes of these targets with nearly 1.8 million small molecules drawn from US FDA-approved drugs and established commercial libraries. Within one week it completed more than 500,000 complex molecular dynamics simulation tasks and screened out 98 compounds, 50 of which showed high activity in biological activity tests. The resulting hit rate of 51% is the highest reported internationally to date.

Dipyridamole, a drug identified by the computational screening, has completed clinical trials involving more than 100 cases. The results show that after three weeks of dipyridamole intervention, the clinical symptoms of severe COVID-19 patients improved markedly, with a discharge rate of 87.5%, clearly better than the 40% discharge rate of the control group. There were no deaths and no cases progressing to critical condition; the drug can improve the cure rate and reduce mortality.

When the drug was combined with Arbidol, the discharge rate of patients with ordinary-type COVID-19 was 100% and the average time to discharge was 7 days, 4 days earlier than with Arbidol alone, showing a good therapeutic effect.

According to actual measurements, with the support of the new-generation Tianhe supercomputer the efficiency of drug screening based on the free energy perturbation-absolute binding free energy method improved 200-fold.

The result is of great practical significance for achieving a rapid drug response to sudden outbreaks.

Meanwhile, using the new-generation Tianhe supercomputing platform, the Tianjin Supercomputing Center has also worked with the team of Academician Zhang Boli of Tianjin University of Traditional Chinese Medicine to screen active ingredients of traditional Chinese medicine for treating COVID-19 and to conduct research on the modernization of traditional Chinese medicine.

The Gordon Bell Prize is the highest international academic award in the field of high-performance computing applications, known as the "Nobel Prize of supercomputing". It is selected and awarded annually by the ACM and has considerable international influence.

Because of the COVID-19 outbreak, the ACM established the ACM Gordon Bell Special Prize for HPC-Based COVID-19 Research in 2020 to recognize outstanding research results in "supercomputing against the epidemic".

The main criterion for the prize is that the application of, and innovation in, high-performance computing has made a significant contribution to understanding the nature of the disease, controlling its spread, or discovering effective treatments.

In 2020, six results were shortlisted for the prize, all of them from the United States.
 
.
The "Nobel Prize of HPC", the 2021 Gordon Bell Prize, has been announced.

2021 ACM Gordon Bell Prize Awarded to Team for Achieving Real-Time Simulation of Random Quantum Circuit

ACM, the Association for Computing Machinery, named a 14-member team, drawn from Chinese institutions, recipients of the 2021 ACM Gordon Bell Prize for their project, Closing the "Quantum Supremacy" Gap: Achieving Real-Time Simulation of a Random Quantum Circuit Using a New Sunway Supercomputer.

The members of the winning team are: Yong (Alexander) Liu, Xin (Lucy) Liu, Fang (Nancy) Li, Yuling Yang, Jiawei Song, Pengpeng Zhao, Zhen Wang, Dajia Peng, and Huarong Chen of Zhejiang Lab, Hangzhou and the National Supercomputing Center in Wuxi; Haohuan Fu and Dexun Chen of Tsinghua University, Beijing, and the National Supercomputing Center in Wuxi; Wenzhao Wu of the National Supercomputing Center in Wuxi; and Heliang Huang and Chu Guo of the Shanghai Research Center for Quantum Sciences.

Quantum supremacy is a term used to denote the point at which a quantum device can solve a problem that no classical computer can solve in a reasonable amount of time. Teams at Google and the University of Science and Technology of China in Hefei both claim to have developed devices that have achieved quantum supremacy.

According to the Gordon Bell Prize recipients, determining whether a device has achieved quantum supremacy for a given task (in a specific scenario) begins with sampling the interactions of the different quantum bits (qubits) in a random quantum circuit (RQC). As the number of possible interactions among qubits in a random quantum circuit is staggeringly large, simulating their interactions is a problem well-suited for a high-performance computer. However, the quantum physics behind the entangled qubits requires that the classical binary bits used in a supercomputer store and compute the information with exponentially-increasing complexity.
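To make the "exponentially-increasing complexity" above concrete, here is a minimal Python sketch of the memory a naive full state-vector simulation would need. It assumes one single-precision complex amplitude (8 bytes) per basis state; the function name `state_vector_bytes` is made up for illustration, and this brute-force storage scheme is only a way to see the scaling, not the method used in the prize-winning work.

```python
def state_vector_bytes(num_qubits: int, bytes_per_amplitude: int = 8) -> int:
    """Bytes needed to hold all 2**n complex amplitudes of an n-qubit state.

    Assumption: 8 bytes per amplitude (two 32-bit floats, i.e. complex64).
    """
    if num_qubits < 0:
        raise ValueError("num_qubits must be non-negative")
    return (2 ** num_qubits) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (30, 40, 53):
        pib = state_vector_bytes(n) / 2 ** 50  # convert bytes to pebibytes
        print(f"{n} qubits: {pib:.6g} PiB")
```

Each extra qubit doubles the requirement. Under the 8-byte assumption, 53 qubits already need 2^56 bytes (64 PiB) just to store the state.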

In their Gordon Bell Prize-winning work, the Chinese researchers introduced a systematic design process that covers the algorithm, parallelization, and architecture required for the simulation. Using a new Sunway Supercomputer, the Chinese team effectively simulated a 10x10x (1+40+1) random quantum circuit (a new milestone for classical simulation of RQC). Their simulation achieved a performance of 1.2 Eflops (one quintillion floating-point operations per second) single-precision, or 4.4 Eflops mixed-precision, using over 41.9 million Sunway cores (processors).

The project far outpaced state-of-the-art approaches to simulating an RQC. For example, the most recent effort, using the Summit supercomputer to simulate a random quantum circuit of the Google Sycamore quantum processor (which has 53 qubits), was estimated to take 10,000 years to perform. By contrast, the Chinese team’s approach employing the Sunway supercomputer takes only 304 seconds for a simulation of similar quantum complexity.
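As a rough check on the scale of that comparison, the two figures quoted above (an estimated 10,000 years versus 304 seconds) can be turned into a ratio. This is only a sketch: it assumes a 365.25-day year, the helper name `speedup` is hypothetical, and the two numbers come from different methods and machines, so the ratio is indicative rather than a like-for-like benchmark.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year of 365.25 days


def speedup(baseline_years: float, new_seconds: float) -> float:
    """Ratio of a baseline runtime given in years to a new runtime in seconds."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return baseline_years * SECONDS_PER_YEAR / new_seconds


if __name__ == "__main__":
    ratio = speedup(10_000, 304)
    print(f"approximate ratio: {ratio:.3e}")
```

With the quoted figures the ratio comes out at roughly one billion.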

The Chinese team explained that they undertook this challenge because achieving real-time simulation of an RQC using a supercomputer would aid both in the development of quantum devices and in bringing algorithmic and architectural innovations within the traditional supercomputing community.

The ACM Gordon Bell Prize tracks the progress of parallel computing and rewards innovation in applying high performance computing to challenges in science, engineering, and large-scale data analytics. The award was presented today by former ACM President Cherri M. Pancake and Professor Mark Parsons, Chair of the 2021 Gordon Bell Prize Award Committee, during the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), which was held in St. Louis, Missouri, and virtually for those who could not attend.
 
.

Besides the grand prize above, there were two other notable Chinese achievements among the nominees:
ACM Gordon Bell Prize nominees
While the quantum simulation research is taking home the prize, the other five nominees represent some of the most intensive research for some of the most pressing research applications in the world. Brief descriptions are included below.

Symplectic Structure-Preserving Particle-in-Cell Whole-Volume Simulation of Tokamak Plasmas to 111.3 Trillion Particles and 25.7 Billion Grids
If you don’t yet know what a tokamak is, just know that they might save the world: tokamaks use magnetism to trap plasma for the production of fusion energy. However, tokamaks are notoriously delicate and unstable, hence the current infeasibility of productive fusion energy. The HPC sector is working to change that: these dozen researchers from China, also using the new Sunway system, simulated the whole-volume confinement toroidal plasmas of a tokamak. These simulations reached up to 111.3 trillion particles and 25.7 billion grids, achieving sustained performance in excess of 201 petaflops double-precision, with the fastest iteration step hitting 298.2 petaflops.
Extreme-Scale Ab Initio Quantum Raman Spectra Simulations on the Leadership HPC System in China
This research, also leveraging the new Sunway exascale system, pushed Raman spectroscopy, a kind of structural fingerprinting, to new limits. “Raman spectroscopy,” these dozen researchers from China explain, “provides chemical and compositional information that can serve as a structural fingerprint for various materials. Therefore, simulations of Raman spectra, including both quantum perturbation analyses and ground-state calculations, are of significant interest.” Full quantum mechanical simulations of Raman spectra for biological materials have proved particularly difficult, and here, the researchers conduct “fast, accurate, massively parallel full ab initio simulations of the Raman spectra of realistic biological systems” up to 3,006 atoms, achieving up to 468.5 petaflops in double-precision and 813.7 petaflops in mixed-half-precision and indicating “the potential for new applications of the QM approach to biological systems.”

Also notable is the Chinese finalist for the special Gordon Bell Prize for COVID-19 Research:
Description: As a theoretically rigorous and accurate method, FEP-ABFE (Free Energy Perturbation-Absolute Binding Free Energy) calculations showed great potential in drug discovery, but its practical application was difficult due to high computational cost. To rapidly discover antiviral drugs targeting SARS-CoV-2 Mpro and TMPRSS2, we performed FEP-ABFE-based virtual screening for ∼12,000 protein-ligand binding systems on a new generation of Tianhe supercomputer. A task management tool was specifically developed for automating the whole process involving more than 0.5 million MD tasks. In further experimental validation, 50 out of 98 tested compounds showed significant inhibitory activity towards Mpro, and one representative inhibitor, dipyridamole, showed remarkable outcomes in subsequent clinical trials. This work not only demonstrates the potential of FEP-ABFE in drug discovery, but also provides an excellent starting point for further development of anti-SARS-CoV-2 drugs. Besides, ∼500 TB of data generated in this work will also accelerate the further development of FEP-related methods.​
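The headline numbers in the description above can be reproduced with a few lines of arithmetic. The sketch below uses only figures quoted there (50 of 98 tested compounds active, roughly 12,000 binding systems, more than 0.5 million MD tasks); the function names are hypothetical, and because the task count is given as a lower bound, the tasks-per-system value is a lower bound as well.

```python
def hit_rate(active: int, tested: int) -> float:
    """Fraction of experimentally tested compounds that turned out active."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return active / tested


def tasks_per_system(total_tasks: int, systems: int) -> float:
    """Average number of MD tasks per protein-ligand binding system."""
    if systems <= 0:
        raise ValueError("systems must be positive")
    return total_tasks / systems


if __name__ == "__main__":
    print(f"hit rate: {hit_rate(50, 98):.1%}")
    print(f"MD tasks per system (lower bound): {tasks_per_system(500_000, 12_000):.1f}")
```

The 50-of-98 figure matches the 51% hit rate reported in the news article quoted earlier in the thread.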
 
. .
China issues 10 application challenges for new generation supercomputer
Source: Xinhua| 2021-12-10 18:44:42|Editor: huaxia

BEIJING, Dec. 10 (Xinhua) -- China has issued a list of 10 application challenges for its new generation supercomputer, with an aim to build a quintillion-scale supercomputing application ecology, Science and Technology Daily reported on Friday.

The list includes the fusion simulation of a magnetic confinement fusion reactor, fluid mechanics simulation of a full-size aerospace vehicle, dynamic simulation of a digital cell atomic system, as well as refined numerical weather forecasting.

It also covers efficient and high-throughput virtual drug screening, a super-scale artificial intelligence pre-training model, and high-resolution sky survey image processing for Five-hundred-meter Aperture Spherical Radio Telescope (FAST) observation data.

The application challenges also include global seismic full waveform inversion, whole-brain neuron dynamic simulation, and sub-mesoscale global ocean numerical simulation in full resolution.

The 10 application challenges for the new generation supercomputer, which is capable of processing one quintillion calculations per second, were jointly issued by China's National Supercomputer Center in Tianjin and dozens of other research teams on Wednesday, according to the newspaper.

The center's chief scientist of supercomputer application research and development Meng Xiangfei said with these applications in place in the future, supercomputers will continue to play an important role in driving high-quality development. Enditem
 
. .
China used to be open and publish our advancements annually, until the US got freaked out at us beating them a few times. Now we don't even tell the world when we have an exascale beast. These buggers might ban Lenovo. Lolol
 
.
Scaling graph traversal to 281 trillion edges with 40 million cores | Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

ABSTRACT
Graph processing, especially high-performance graph traversal, plays a more and more important role in data analytics. The successor of Sunway TaihuLight, New Sunway, is equipped with nearly 10 PB memory and over 40 million cores, which brings the opportunity to process hundreds of trillions of edges graphs. However, the graph with an unprecedented scale also brings severe performance challenges, including load imbalance, poor locality, and irregular access of graph traversal workload.

To address the scalability problem, we propose a novel 3-level degree-aware 1.5D graph partitioning, which benefits from both delegated 1D and 2D partitioning. By delegating extremely heavy vertices globally and other heavy vertices on columns and rows in the processes mesh, we break the scalability wall of previous partitioning methods. Together with sub-iteration direction optimization, core group-aware core subgraph segmenting, and a new on-chip sorting mechanism using RMA, we achieve 180,792 GTEPS on a graph with 281 trillion edges, using 103,912 processors with over 40 million cores, achieving 1.75X performance and 8X capacity compared to the previous state of the art and conforming to the Graph 500 BFS benchmark [14].
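The "3-level degree-aware" idea in the abstract can be pictured with a toy classifier that sorts vertices into three groups by degree. This is only an illustrative sketch: the abstract does not give the actual thresholds or data structures, so the function name, the two threshold parameters, and the class labels are all assumptions, and none of the real partitioning, delegation, or communication logic is shown.

```python
def classify_by_degree(degrees, extreme_threshold, heavy_threshold):
    """Split vertices into three degree classes (hypothetical thresholds).

    degrees: mapping of vertex id -> degree
    'extreme' stands in for vertices that would be delegated globally,
    'heavy' for those delegated along mesh rows and columns,
    'light' for everything else.
    """
    if heavy_threshold > extreme_threshold:
        raise ValueError("heavy_threshold must not exceed extreme_threshold")
    classes = {"extreme": [], "heavy": [], "light": []}
    for vertex, degree in degrees.items():
        if degree >= extreme_threshold:
            classes["extreme"].append(vertex)
        elif degree >= heavy_threshold:
            classes["heavy"].append(vertex)
        else:
            classes["light"].append(vertex)
    return classes


if __name__ == "__main__":
    toy = {"a": 1000, "b": 50, "c": 3, "d": 2, "e": 60}
    print(classify_by_degree(toy, extreme_threshold=500, heavy_threshold=40))
```

In power-law graphs a handful of vertices carry most of the edges, which is why treating them separately from the long tail of low-degree vertices can help with load balance.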

 
.
BaGuaLu | Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
ABSTRACT
Large-scale pretrained AI models have shown state-of-the-art accuracy in a series of important applications. As the size of pretrained AI models grows dramatically each year in an effort to achieve higher accuracy, training such models requires massive computing and memory capabilities, which accelerates the convergence of AI and HPC. However, there are still gaps in deploying AI applications on HPC systems, which need application and system co-design based on specific hardware features.
To this end, this paper proposes BaGuaLu, the first work targeting training brain scale models on an entire exascale supercomputer, the New Generation Sunway Supercomputer. By combining hardware-specific intra-node optimization and hybrid parallel strategies, BaGuaLu enables decent performance and scalability on unprecedentedly large models. The evaluation shows that BaGuaLu can train 14.5-trillion-parameter models with a performance of over 1 EFLOPS using mixed-precision and has the capability to train 174-trillion-parameter models, which rivals the number of synapses in a human brain.
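For a sense of scale, the parameter counts in the abstract can be converted into raw storage. The sketch below assumes 2 bytes per parameter (half precision), which is an assumption rather than something the abstract states, and it counts weights only; gradients, optimizer state, and activations would add considerably more. The name `parameter_bytes` is made up for illustration.

```python
TB = 10 ** 12  # decimal terabyte


def parameter_bytes(num_parameters: int, bytes_per_parameter: int = 2) -> int:
    """Raw bytes needed to store model weights only.

    Assumption: 2 bytes per parameter (e.g. fp16); the paper's actual
    storage format is not given in the abstract.
    """
    if num_parameters < 0 or bytes_per_parameter <= 0:
        raise ValueError("invalid sizes")
    return num_parameters * bytes_per_parameter


if __name__ == "__main__":
    for params in (14_500_000_000_000, 174_000_000_000_000):
        print(f"{params:,} parameters: {parameter_bytes(params) / TB:.0f} TB")
```

Under that assumption the weights alone come to 29 TB and 348 TB respectively, which is why a machine with memory on the order of petabytes is needed.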
 
.
Establishing a non-hydrostatic global atmospheric modeling system at 3-km horizontal resolution with aerosol feedbacks on the Sunway supercomputer of China - ScienceDirect

Abstract

During the era of global warming and highly urbanized development, extreme and high impact weather as well as air pollution incidents influence everyday life and might even cause the incalculable loss of life and property. Although, with the vast development of atmospheric model, there still exists substantial numerical forecast biases objectively. To predict accurately extreme weather, severe air pollution, and abrupt climate change, the numerical atmospheric model requires not only to simulate meteorology and atmospheric compositions simultaneously involving many sophisticated physical and chemical processes but also at high spatiotemporal resolution. Global integrated atmospheric simulation at spatial resolutions of a few kilometers remains challenging due to its intensive computational and input/output (I/O) requirement. Through multi-dimension-parallelism structuring, aggressive and finer-grained optimizing, manual vectorizing, and parallelized I/O fragmenting, an integrated Atmospheric Model Across Scales (iAMAS) was established on the new Sunway supercomputer platform to significantly increase the computational efficiency and reduce the I/O cost. The global 3-km atmospheric simulation for meteorology with online integrated aerosol feedbacks with iAMAS was scaled to 39,000,000 processor cores and achieved the speed of 0.82 simulation day per hour (SDPH) with routine I/O, which enabled us to perform 5-day global weather forecast at 3-km horizontal resolution with online natural aerosol impacts. The results demonstrate the promising future that the increasing of spatial resolution to a few kilometers with online integrated aerosol feedbacks may significantly improve the global weather forecast.
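The throughput figure in the abstract, 0.82 simulation days per hour (SDPH), translates directly into wall-clock time for a forecast. A minimal sketch, assuming the rate stays constant over the whole run (the helper name is hypothetical):

```python
def wall_clock_hours(forecast_days: float, sdph: float) -> float:
    """Hours of compute needed to simulate `forecast_days` at a given SDPH rate."""
    if sdph <= 0:
        raise ValueError("sdph must be positive")
    return forecast_days / sdph


if __name__ == "__main__":
    hours = wall_clock_hours(5, 0.82)
    print(f"5-day forecast at 0.82 SDPH: about {hours:.1f} hours")
```

At the quoted rate a 5-day global forecast takes roughly six hours of wall-clock time, short enough to be usable operationally.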

 
. .