Proceedings of the 15th international conference on Supercomputing

ICS '01: Proceedings of the 15th international conference on Supercomputing

June 2001

2001 Proceeding

Chairmen:
Mario Mango Furnari, Istituto di Cibernetica, CNR, Italy
Efstratios Gallopoulos, Univ. of Patras

Publisher:

Association for Computing Machinery, New York, NY, United States

Conference:

ICS01: 15th International Conference on Supercomputing, Sorrento, Italy

ISBN:

978-1-58113-410-0

Published:

17 June 2001

Sponsors:

SIGARCH

Bibliometrics

Citation count

948

Downloads (6 weeks)

Downloads (12 months)

179

Downloads (cumulative)

23,524

Abstract

No abstract available.

Article

Analytical cache models with applications to cache partitioning

G. Edward Suh,
Srinivas Devadas,
Larry Rudolph

pp 1–12, https://doi.org/10.1145/377792.377797

An accurate, tractable, analytic cache model for time-shared systems is presented, which estimates the overall cache miss-rate of a multiprocessing system with any cache size and time quanta. The input to the model consists of the isolated miss-rate ...

Metrics
Total Citations: 105
Total Downloads: 911
Last 12 Months: 5
Last 6 weeks: 1

Article

A synthesis of memory mechanisms for distributed architectures

Jiajing Zhu,
Jay Hoeflinger,
David Padua

pp 13–22, https://doi.org/10.1145/377792.377799

Producing efficient parallel programs for distributed memory multiprocessors is a difficult task. Hand-coding efficient parallel programs for these systems can be extremely difficult, time consuming and error-prone, so people have turned to the shared ...

Metrics
Total Citations: 6
Total Downloads: 723
Last 12 Months: 1
Last 6 weeks: 0

Article

The trade-off between implicit and explicit data distribution in shared-memory programming paradigms

Dimitrios S. Nikolopoulos,
Eduard Ayguadé,
Theodore S. Papatheodorou,
Constantine D. Polychronopoulos,
Jesús Labarta

pp 23–37, https://doi.org/10.1145/377792.377801

This paper explores previously established and novel methods for scaling the performance of OpenMP on NUMA architectures. The spectrum of methods under investigation includes OS-level automatic page placement algorithms, dynamic page migration and manual ...

Metrics
Total Citations: 9
Total Downloads: 369
Last 12 Months: 0
Last 6 weeks: 0

Article

Fractal symbolic analysis

Nikolay Mateev,
Vijay Menon,
Keshav Pingali

pp 38–49, https://doi.org/10.1145/377792.377804

Modern compilers perform wholesale restructuring of programs to improve their efficiency. Dependence analysis is the most widely used technique for proving the correctness of such transformations, but it suffers from the limitation that it considers ...

Metrics
Total Citations: 9
Total Downloads: 284
Last 12 Months: 0
Last 6 weeks: 0

Article

Data locality enhancement by memory reduction

Yonghong Song,
Rong Xu,
Cheng Wang,
Zhiyuan Li

pp 50–64, https://doi.org/10.1145/377792.377806

In this paper, we propose memory reduction as a new approach to data locality enhancement. Under this approach, we use the compiler to reduce the size of the data repeatedly referenced in a collection of nested loops. Between their reuses, the data will ...

Metrics
Total Citations: 42
Total Downloads: 690
Last 12 Months: 6
Last 6 weeks: 0

Article

Eliminating redundancies in sum-of-product array computations

Steven J. Deitz,
Bradford L. Chamberlain,
Lawrence Snyder

pp 65–77, https://doi.org/10.1145/377792.377807

Array programming languages such as Fortran 90, High Performance Fortran and ZPL are well-suited to scientific computing because they free the scientist from the responsibility of managing burdensome low-level details that complicate programming in ...

Metrics
Total Citations: 37
Total Downloads: 259
Last 12 Months: 11
Last 6 weeks: 4

Article

Monotonic evolution: an alternative to induction variable substitution for dependence analysis

Peng Wu,
Albert Cohen,
Jay Hoeflinger,
David Padua

pp 78–91, https://doi.org/10.1145/377792.377809

We present a new approach to dependence testing in the presence of induction variables. Instead of looking for closed form expressions, our method computes monotonic evolution which captures the direction in which the value of a variable changes. This ...

Metrics
Total Citations: 24
Total Downloads: 426
Last 12 Months: 5
Last 6 weeks: 0

Article

Optimizing strategies for telescoping languages: procedure strength reduction and procedure vectorization

Arun Chauhan,
Ken Kennedy

pp 92–101, https://doi.org/10.1145/377792.377812

At Rice University, we have undertaken a project to construct a framework for generating high-level problem solving languages that can achieve high performance on a variety of platforms. The underlying strategy, called telescoping languages, builds ...

Metrics
Total Citations: 13
Total Downloads: 369
Last 12 Months: 1
Last 6 weeks: 0

Article

Loop optimization for a class of memory-constrained computations

D. Cociorva,
J. W. Wilkins,
C. Lam,
G. Baumgartner,
J. Ramanujam,
P. Sadayappan

pp 103–113, https://doi.org/10.1145/377792.377814

Compute-intensive multi-dimensional summations that involve products of several arrays arise in the modeling of electronic structure of materials. Sometimes several alternative formulations of a computation, representing different space-time trade-offs, ...

Metrics
Total Citations: 25
Total Downloads: 430
Last 12 Months: 0
Last 6 weeks: 0

Article

Fast parallel in-memory 64-bit sorting

Daniel Jiménez-González,
Juan J. Navarro,
Josep-L. Larriba-Pey

pp 114–122, https://doi.org/10.1145/377792.377816

Parallel in-memory 64-bit sorting is an important problem in Database Management Systems and other applications such as Internet Search Engines and Data Mining Tools.

We propose a new algorithm that we call Parallel Counting Split Radix sort, PCS-Radix ...

Metrics
Total Citations: 7
Total Downloads: 636
Last 12 Months: 19
Last 6 weeks: 0

Article

Optimizing locality for ODE solvers

Thomas Rauber,
Gudula Rünger

pp 123–132, https://doi.org/10.1145/377792.377818

Runge-Kutta methods are popular methods for the solution of systems of ordinary differential equations and are provided by many scientific libraries. The performance of Runge-Kutta methods does not only depend on the specific application problem to be ...

Metrics
Total Citations: 11
Total Downloads: 620
Last 12 Months: 1
Last 6 weeks: 1

Article

Array language support for parallel sparse computation

Bradford L. Chamberlain,
Lawrence Snyder

pp 133–145, https://doi.org/10.1145/377792.377820

This paper describes an array-based language-level approach to parallel sparse computation. Our approach is unique due to its separation of sparse index sets from arrays, both syntactically and in the implementation. This design allows users to express ...

Metrics
Total Citations: 6
Total Downloads: 329
Last 12 Months: 1
Last 6 weeks: 0

Article

A parallel algorithm for sparse symbolic LU factorization without pivoting on out-of-core matrices

Michel Cosnard,
Laura Grigori

pp 146–153, https://doi.org/10.1145/377792.377823

Finding the nonzero structures of the lower and upper triangular factors of an unsymmetric sparse matrix A is an important problem in the field of sparse matrix computations. Complementing previous research on sequential algorithms, we develop a ...

Metrics
Total Citations: 2
Total Downloads: 442
Last 12 Months: 1
Last 6 weeks: 0

Article

Tools for application-oriented performance tuning

John Mellor-Crummey,
Robert Fowler,
David Whalley

pp 154–165, https://doi.org/10.1145/377792.377826

Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance bottlenecks. Existing performance tools don't adequately support this ...

Metrics
Total Citations: 38
Total Downloads: 478
Last 12 Months: 3
Last 6 weeks: 0

Article

Global optimization techniques for automatic parallelization of hybrid applications

Dhruva R. Chakrabarti,
Prithviraj Banerjee

pp 166–180, https://doi.org/10.1145/377792.377827

This paper presents a novel technique to perform global optimization of communication and preprocessing calls in the presence of array accesses with arbitrary subscripts. Our scheme is presented in the context of automatic parallelization of sequential ...

Metrics
Total Citations: 0
Total Downloads: 323
Last 12 Months: 0
Last 6 weeks: 0

Article

Tuning high-performance scientific codes: the use of performance models to control resource usage during data migration and I/O

Jonghyun Lee,
Marianne Winslett,
Xiaosong Ma,
Shengke Yu

pp 181–195, https://doi.org/10.1145/377792.377829

Large-scale parallel simulations are a popular tool for investigating phenomena ranging from nuclear explosions to protein folding. These codes produce copious output that must be moved to the workstation where it will be visualized. Scientists have a ...

Metrics
Total Citations: 7
Total Downloads: 349
Last 12 Months: 1
Last 6 weeks: 0

Article

Computer aided hand tuning (CAHT): “applying case-based reasoning to performance tuning”

Antoine Monsifrot,
François Bodin

pp 196–203, https://doi.org/10.1145/377792.377831

For most parallel and high performance systems, tuning guides provide the users with advices to optimize the execution time of their programs. Execution time may be very sensitive to small program changes. Such modifications may be local (on loop) or ...

Metrics
Total Citations: 6
Total Downloads: 341
Last 12 Months: 1
Last 6 weeks: 0

Article

Cache performance for multimedia applications

Nathan T. Slingerland,
Alan Jay Smith

pp 204–217, https://doi.org/10.1145/377792.377833

The caching behavior of multimedia applications has been described as having high instruction reference locality within small loops, very large working sets, and poor data cache performance due to non-locality of data references. Despite this, there is ...

Metrics
Total Citations: 39
Total Downloads: 1,595
Last 12 Months: 6
Last 6 weeks: 0

Article

On the potential of tolerant region reuse for multimedia applications

Carlos Álvarez,
Jesús Corbal,
Esther Salamí,
Mateo Valero

pp 218–228, https://doi.org/10.1145/377792.377835

The recent years have shown an interesting evolution in the mid-end to low-end embedded domain. Portable systems are growing in importance as they improve in storage capacity and in interaction capabilities with general purpose systems. Furthermore, ...

Metrics
Total Citations: 13
Total Downloads: 227
Last 12 Months: 1
Last 6 weeks: 0

Article

Evaluation of processor code efficiency for embedded systems

Morgan Hirosuke Miki,
Mamoru Sakamoto,
Shingo Miyamoto,
Yoshinori Takeuchi,
Toyohiko Yoshida,
Isao Shirakawa

pp 229–235, https://doi.org/10.1145/377792.377837

This paper evaluates the code efficiency of the ARM, Java, and x86 instruction sets by compiling the SPEC CPU95/CPU2000/JVM98 and CaffeineMark benchmarks, in terms of code sizes, basic block sizes, instruction distributions, and average instruction ...

Metrics
Total Citations: 1
Total Downloads: 611
Last 12 Months: 0
Last 6 weeks: 0

Article

Improving 3D geometry transformations on a simultaneous multithreaded SIMD processor

Claude Limousin,
Julien Sebot,
Alexis Vartanian,
Nathalie Drach-Temam

pp 236–245, https://doi.org/10.1145/377792.377839

In this paper we evaluate the performance of an SMT processor used as the geometry processor for a 3D polygonal rendering engine. To evaluate this approach, we consider PMesa (a parallel version of Mesa) which parallelizes the geometry stage of the 3D ...

Metrics
Total Citations: 12
Total Downloads: 535
Last 12 Months: 14
Last 6 weeks: 1

Article

Bringing together automatic differentiation and OpenMP

H. Martin Bücker,
Bruno Lang,
Dieter an Mey,
Christian H. Bischof

pp 246–251, https://doi.org/10.1145/377792.377842

Derivatives of almost arbitrary functions can be evaluated efficiently by automatic differentiation whenever the functions are given in the form of computer programs in a high-level programming language such as Fortran, C, or C++. Furthermore, in ...

Metrics
Total Citations: 18
Total Downloads: 319
Last 12 Months: 6
Last 6 weeks: 3

Article

Automatic code generation for a turbulence scheme

Paul van der Mark,
Gerard Cats,
Lex Wolters

pp 252–259, https://doi.org/10.1145/377792.377846

In this paper we describe how to extend CTADEL, a Problem Solving Environment, in order to generate code for a turbulence scheme, in our case, within a numerical weather prediction model (NWP). Common for these schemes is the presence of implicit ...

Metrics
Total Citations: 3
Total Downloads: 195
Last 12 Months: 1
Last 6 weeks: 0

Article

Towards the effective parallel computation of matrix pseudospectra

C. Bekas,
E. Kokiopoulou,
I. Koutis,
E. Gallopoulos

pp 260–269, https://doi.org/10.1145/377792.377847

Given a matrix A, the computation of its pseudospectrum $\Lambda_\epsilon(A)$ is a far more expensive task than the computation of characteristics such as the condition number and the matrix spectrum. As research of the last 15 years has shown, however, the matrix ...

Metrics
Total Citations: 2
Total Downloads: 320
Last 12 Months: 1
Last 6 weeks: 0

Article

A graphical tool for driving the parallel computation of pseudospectra

Dany Mezher

pp 270–276, https://doi.org/10.1145/377792.377848

This paper presents the programming environment of a new tool for the parallel computation of pseudospectra. Based on the PPAT algorithm described in [16, 17], this algorithm offers total reliability and can handle singularities along the level curve ...

Metrics
Total Citations: 2
Total Downloads: 202
Last 12 Months: 2
Last 6 weeks: 0

Article

Register-sensitive selection, duplication, and sequencing of instructions

Vivek Sarkar,
Mauricio J. Serrano,
Barbara B. Simons

pp 277–288, https://doi.org/10.1145/377792.377849

In this paper, we present a new framework for selecting, duplicating and sequencing instructions so as to decrease register pressure. The motivation for this work is to target current and future high-performance processors where reductions in register ...

Metrics
Total Citations: 7
Total Downloads: 642
Last 12 Months: 3
Last 6 weeks: 0

Article

Load and store reuse using register file contents

Soner Önder,
Rajiv Gupta

pp 289–302, https://doi.org/10.1145/377792.377850

The detection of opportunities for value reuse optimizations in memory operations requires both the addresses and values associated with these operations to be available. Although the values are typically available in the physical register file, their ...

Metrics
Total Citations: 23
Total Downloads: 546
Last 12 Months: 14
Last 6 weeks: 0

Article

Improving Gang Scheduling through job performance analysis and malleability

Julita Corbalan,
Xavier Martorell,
Jesus Labarta

pp 303–311, https://doi.org/10.1145/377792.377852

The OpenMP programming model provides parallel applications a very important feature: job malleability. Job malleability is the capacity of an application to dynamically adapt its parallelism to the number of processors allocated to it. We believe that ...

Metrics
Total Citations: 13
Total Downloads: 499
Last 12 Months: 2
Last 6 weeks: 0

Article

Reducing the complexity of the issue logic

Ramon Canal,
Antonio González

pp 312–320, https://doi.org/10.1145/377792.377854

The issue logic of dynamically scheduled superscalar processors is one of their most complex and power-consuming parts. In this paper we present alternative issue-logic designs that are much simpler than the traditional scheme while they retain most of ...

Metrics
Total Citations: 49
Total Downloads: 538
Last 12 Months: 8
Last 6 weeks: 1

Article

Slice-processors: an implementation of operation-based prediction

Andreas Moshovos,
Dionisios N. Pnevmatikatos,
Amirali Baniasadi

pp 321–334, https://doi.org/10.1145/377792.377856

We describe the Slice Processor micro-architecture that implements a generalized operation-based prefetching mechanism. Operation-based prefetchers predict the series of operations, or the computation slice that can be used to calculate forthcoming ...

Metrics
Total Citations: 111
Total Downloads: 440
Last 12 Months: 24
Last 6 weeks: 3

Contributors

Mario Mango Furnari
National Research Council of Italy (CNR), Institute of Applied Sciences and Intelligent Systems “Eduardo Caianiello”
- Publication Years: 1987 - 2006
- Publication counts: 22
- Citation count: 14
- Available for Download: 4
- Downloads (cumulative): 2,070
- Downloads (12 months): 84
- Downloads (6 weeks): 14
- Average Downloads per Article: 518
- Average Citation per Article: 1
Efstratios Gallopoulos
University of Patras
- Publication Years: 1985 - 2023
- Publication counts: 59
- Citation count: 423
- Available for Download: 8
- Downloads (cumulative): 2,514
- Downloads (12 months): 288
- Downloads (6 weeks): 34
- Average Downloads per Article: 314
- Average Citation per Article: 7

Acceptance Rates

ICS '01 Paper Acceptance Rate: 45 of 133 submissions, 34%
Overall Acceptance Rate: 584 of 2,055 submissions, 28%

Year	Submitted	Accepted	Rate
ICS '21	157	39	25%
ICS '15	160	40	25%
ICS '14	160	34	21%
ICS '13	202	43	21%
ICS '06	141	37	26%
ICS '03	171	36	21%
ICS '02	144	31	22%
ICS '01	133	45	34%
ICS '00	122	33	27%
ICS '99	180	57	32%
ICS '97	135	45	33%
ICS '96	116	50	43%
ICS '95	120	49	41%
ICS '94	114	45	39%
Overall	2,055	584	28%
