!Back to Tokyo
By Shinkansen. I got up to the platform just after a train had left, so I ended up queuing for one full train interval. A TV camera on a tripod was filming right beside me, so I am surely not in the shot at all. I could not get a window seat with a power outlet, but I managed to secure a seat and headed straight for Tokyo. The aisles were packed with people, though, so even getting to the toilet was a struggle. Well, just being able to stay seated makes me pretty lucky.

!Programming with Tiles
{{category 論文読み}}
Adds two new classes, dynamic partitioning and overlapped tiling, to the Hierarchically Tiled Array proposed at PPoPP 2006. The evaluation compares ease of programming and performance.

::HTA
Hierarchically Tiled Arrays (HTAs) are arrays that may be partitioned into tiles. These tiles can be conventional arrays or lower-level HTAs. Tiles can be distributed across processors in a distributed-memory machine or be stored in a single machine according to a user-specified layout.
The C++ implementation of the HTA class is a library with ~18000 lines of code. It only contains header files, as most classes in the library are C++ templates to facilitate inlining.

::Dynamic partitioning
Cache-oblivious algorithms and FLAME require dynamic changes of the tile layout. part/rmPart are added for this.
← TBB requires more lines of code, variables, and data types than the HTA to express the same problem.

::Overlapped tiling
Stencil codes benefit from tiling, because tiling increases locality and determines the data distribution when running in parallel.
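To see why a tiled stencil needs data from neighboring tiles, here is a minimal Python sketch, assuming a 1D 3-point averaging stencil with fixed end values. The function names (jacobi_step, split_with_ghosts, tiled_jacobi_step) are hypothetical and are not part of the HTA library; plain Python lists stand in for tiles.

```python
def jacobi_step(cells):
    """One 3-point averaging step; the two end cells are held fixed."""
    out = list(cells)
    for i in range(1, len(cells) - 1):
        out[i] = (cells[i - 1] + cells[i] + cells[i + 1]) / 3.0
    return out


def split_with_ghosts(cells, tile_size):
    """Cut the interior of `cells` into tiles, each padded with one extra cell per side.

    The extra cells are copies of the neighboring tiles' edge elements
    (or of the fixed boundary values at the two ends of the array).
    """
    tiles = []
    for start in range(1, len(cells) - 1, tile_size):
        stop = min(start + tile_size, len(cells) - 1)
        tiles.append(cells[start - 1:stop + 1])  # left copy + owned cells + right copy
    return tiles


def tiled_jacobi_step(cells, tile_size):
    """Apply one step tile by tile and stitch the owned cells back together."""
    result = [cells[0]]
    for tile in split_with_ghosts(cells, tile_size):
        updated = jacobi_step(tile)   # each tile is updated independently
        result.extend(updated[1:-1])  # drop the copied edge cells, keep owned cells
    result.append(cells[-1])
    return result


# Sanity check: the tiled update matches the untiled one.
grid = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
assert tiled_jacobi_step(grid, tile_size=3) == jacobi_step(grid)
```

This sketch rebuilds the copied edge cells from the full array on every call. In a distributed setting they would instead have to be refreshed from the neighboring tiles after each step.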
← Programmers create a shadow or ghost region around each tile that contains a copy of the elements of the neighbor tiles.
← These regions are updated automatically or manually.

::Evaluation
* Performance evaluation
** sequential (matrix multiplication, LU decomposition, 3D Jacobi)
** parallel (Parallel Merge, MG/LU NAS)
* Readability/Productivity
** the programming effort[17]
** the cyclomatic number[22]
** lines of code

!A Portable Runtime Interface For Multi-Level Memory Hierarchies
{{category 論文読み}}
For moving data and computation through parallel machines with multi-level memory hierarchies.
For multi-core/SMP, Cell B.E., and distributed-memory clusters.

::The Runtime Interface
Adaptation of the Sequoia compiler.
* initialization/setup of the machine, including communication resources and resources at all levels where tasks can be executed
* data transfers between memory levels using asynchronous bulk transfers between arrays
* task execution at specified levels of the machine
* Bulk transfers are strengthened
* Disk and memory are handled through the same interface
* Top API and Bottom API

!Multiscalar Processors
{{category 論文読み}}
Multiscalar processors use a new, aggressive implementation paradigm for extracting large quantities of ILP from ordinary high-level language programs.