

Proceedings of the National Academy of Sciences of Belarus. Physics and Mathematics Series

ISSN 1561-2430, 2524-2415

The Republican Unitary Enterprise Publishing House "Belaruskaya Navuka"

10.29235/1561-2430-2018-54-4-417-426


Research Article

MATHEMATICS

Tiled parallel 2D computational processes

Likhoded N. A.

D. Sc. (Physics and Mathematics), Professor of the Department of Computational Mathematics, Faculty of Applied Mathematics and Informatics

likhoded@bsu.by

Paliashchuk M. A.

Junior Researcher (Assistant), Department of Computational Mathematics, Faculty of Applied Mathematics and Informatics

poleschuma@bsu.by

Belarusian State University, Minsk

2018

09.01.2019

Vol. 54, no. 4, pp. 417–426

2019

Likhoded N.A., Paliashchuk M.A.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://vestifm.belnauka.by/jour/article/view/348

An algorithm implemented on a distributed-memory parallel computer usually has a tiled structure: the set of operations is partitioned into subsets called tiles (grains of computation). One modern approach to obtaining tiled versions of algorithms is tiling, a transformation based on information sections of the iteration space that produces macro-operations (tiles). The operations of one tile are executed atomically, as a single unit of computation, and data are exchanged as arrays. For algorithms given by multidimensional nested loops, this paper proposes a method for constructing tiled computational processes logically organized in a two-dimensional structure. Compared with one-dimensional structures, two-dimensional structures are applicable in fewer cases, but they can offer advantages when algorithms are implemented on distributed-memory parallel computers. Possible advantages include a smaller volume of communication operations, shorter ramp-up and wind-down phases of the computation, a potentially larger number of computational processes, and data exchange confined to rows or columns of processes. The results generalize to the two-dimensional case some aspects of the method for constructing parallel computational processes organized in a one-dimensional structure. In particular, the possibility of organizing parallel computational processes that are fully loaded with work is investigated. It is shown that, under certain restrictions on the structure and length of the loops, it is sufficient to tile along three coordinates of the multidimensional iteration space. In earlier theoretical studies, parallelism of tiled computations was guaranteed when information sections existed along all coordinates of the iteration space and, for the simpler one-dimensional structure, along two coordinates.
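The tiling idea summarized in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: a hypothetical three-dimensional loop nest with unit dependences along each coordinate, tile sizes `r1`, `r2`, `r3`, and a cyclic assignment of tiles to a `p` by `q` grid of processes. The processes are only simulated sequentially, so the sketch shows that tile-by-tile execution reproduces the untiled result; it does not implement the authors' construction.

```python
from itertools import product


def point(a, i, j, k):
    """Loop body: each point depends on its three lower neighbours."""
    return (1 + a.get((i - 1, j, k), 0)
              + a.get((i, j - 1, k), 0)
              + a.get((i, j, k - 1), 0))


def reference(n1, n2, n3):
    """Untiled loop nest executed in the original lexicographic order."""
    a = {}
    for i, j, k in product(range(n1), range(n2), range(n3)):
        a[i, j, k] = point(a, i, j, k)
    return a


def blocks(n, r):
    """Split range(n) into consecutive blocks of at most r iterations."""
    return [range(lo, min(lo + r, n)) for lo in range(0, n, r)]


def tiled(n1, n2, n3, r1, r2, r3, p, q):
    """Tile along three coordinates; assign tiles to a p x q process grid."""
    a, owner = {}, {}
    b1, b2, b3 = blocks(n1, r1), blocks(n2, r2), blocks(n3, r3)
    # Tiles are visited in lexicographic order. Because every dependence
    # vector is non-negative, each tile only reads data produced by itself
    # or by tiles visited earlier, so a tile can run as one atomic unit.
    for t1, t2, t3 in product(range(len(b1)), range(len(b2)), range(len(b3))):
        # Assumed mapping: the first two tile coordinates select the process,
        # the third coordinate is traversed sequentially by that process.
        owner[t1, t2, t3] = (t1 % p, t2 % q)
        for i, j, k in product(b1[t1], b2[t2], b3[t3]):
            a[i, j, k] = point(a, i, j, k)
    return a, owner


if __name__ == "__main__":
    a_tiled, owner = tiled(5, 4, 6, 2, 2, 3, 2, 2)
    print(a_tiled == reference(5, 4, 6))
```

In a real distributed-memory implementation, the tiles owned by different grid positions would run on different processes, and the boundary values read from neighbouring tiles would be exchanged as arrays along the rows and columns of the grid.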

parallel computations; parallelization of algorithms; distributed memory parallel computer; data exchange reduction


The authors declare that there are no conflicts of interest present.