Citation Link: https://doi.org/10.25819/ubsi/10740
Distributed Time-Triggered Caching and Memory Access Optimisation for Neural Network Tensor Accelerators in Multicore Safety-Critical Systems
Alternate Title
Verteiltes Time-Triggered Caching und Speicherzugriffsoptimierung für Neural Network Tensor Accelerators in Multicore Safety-Critical Systems
Source Type
Doctoral Thesis
Author
Issue Date
2025
Abstract
Neural network accelerators are essential for meeting the computational demands of modern AI applications; however, their use in safety-critical and real-time environments presents significant challenges, primarily due to inefficiencies in memory access and interference from other applications, which lead to unpredictable memory access patterns. This dissertation addresses these memory access bottlenecks by proposing a time-triggered architecture that enhances the memory access mechanisms of tensor accelerators. Traditional accelerators, such as the Versatile Tensor Accelerator (VTA), encounter limitations related to memory bandwidth, resource contention, and variable latency, which impair performance in safety-critical, memory-intensive tasks. This work introduces the Time-Triggered Memory Access VTA (TTmaVTA), which applies time-triggered architectures to control and regulate the memory access of the VTA, ensuring predictable and conflict-free memory transactions. The TTmaVTA framework is further refined with two memory optimisation techniques: prefetching and caching. These enhancements, collectively referred to as OPTTmaVTA, improve memory throughput while significantly reducing memory access latency. Prefetching retrieves data during idle memory cycles, minimising delays due to dependency stalls, while deterministic caching optimises frequently accessed memory operations, reducing memory bus accesses. Together, these methods improve the memory performance of neural network accelerators while ensuring timing predictability, particularly in safety-critical contexts. This dissertation presents hardware experiments and software simulations that validate the effectiveness of TTmaVTA and OPTTmaVTA in improving memory access predictability and memory throughput.
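The core idea of time-triggered memory access can be sketched as a static slot table: each slot of a repeating bus cycle is pre-assigned to exactly one memory client, so transactions never contend for the shared bus. The following minimal Python sketch is illustrative only; the slot lengths, client names, and table layout are assumptions, not the TTmaVTA implementation.

```python
# Illustrative time-triggered memory schedule (not the TTmaVTA design):
# a fixed, repeating table of bus slots, each owned by one memory client.

CYCLE_SLOTS = 8  # length of one repeating schedule cycle, in bus slots

# Static slot table: because ownership is fixed at design time, the
# outcome of every bus slot is known in advance and conflict-free.
SLOT_TABLE = {
    0: "load",     # fetch input tensor tiles
    1: "load",
    2: "weight",   # fetch weight tiles
    3: "compute",  # traffic for the compute core
    4: "compute",
    5: "store",    # write back output tiles
    6: "store",
    7: "idle",     # reserved slot, usable for prefetching
}

def owner(bus_slot: int) -> str:
    """Return the client allowed to access memory in a given bus slot."""
    return SLOT_TABLE[bus_slot % CYCLE_SLOTS]

def may_access(client: str, bus_slot: int) -> bool:
    """A client may issue a transaction only in its pre-assigned slots."""
    return owner(bus_slot) == client
```

Because the schedule repeats deterministically, worst-case memory latency for any client can be bounded offline, which is the property that makes such schemes attractive for safety-critical certification.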
Hardware-based experiments using a Conv2D workload on an FPGA demonstrate that TTmaVTA achieves a 2.86% reduction in execution time, primarily due to improved memory scheduling and conflict resolution; however, resource overhead limits the scalability of TTmaVTA for larger workloads. Software simulations of OPTTmaVTA with ResNet-18 show a 12.68% improvement in memory access time through prefetching and an 8.75% gain through caching. Overall, the OPTTmaVTA architecture achieves improved memory throughput, with a total latency reduction of approximately 19.86% across all memory operations. Finally, a scheduling algorithm maps memory access patterns to predefined schedules, ensuring deterministic execution and adherence to real-time constraints. Through a combination of theoretical analysis and practical evaluations, this work makes a substantial contribution to hardware-software co-design for neural network accelerators, particularly suited for applications in safety-critical domains.
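The deterministic-caching idea summarised above can be illustrated with a direct-mapped cache, whose hit/miss behaviour for any access sequence is fully predictable. The sketch below is a minimal, hypothetical example; the cache size, tile addressing, and class name are assumptions, not the OPTTmaVTA design.

```python
# Illustrative deterministic (direct-mapped) cache for repeated tile
# reads; not the OPTTmaVTA implementation.

class DirectMappedTileCache:
    """Direct-mapped cache: each address maps to exactly one line, so the
    hit/miss outcome of any access trace is statically analysable."""

    def __init__(self, num_lines: int = 4):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # one stored tag per line
        self.bus_accesses = 0           # counts trips to the memory bus

    def read(self, addr: int) -> bool:
        """Return True on a hit; a miss costs one memory bus access."""
        line, tag = addr % self.num_lines, addr // self.num_lines
        if self.tags[line] == tag:
            return True                 # hit: served without bus traffic
        self.tags[line] = tag           # miss: fill the line from memory
        self.bus_accesses += 1
        return False

# Two passes over the same four tile addresses: the first pass misses
# and fills the cache, the second pass hits entirely, halving the
# number of memory bus accesses for this trace.
cache = DirectMappedTileCache(num_lines=4)
trace = [0, 1, 2, 3] * 2
hits = sum(cache.read(a) for a in trace)
```

Predictability, not just hit rate, is the point: for a known tile-access trace, the exact number of bus accesses can be computed ahead of time and folded into a worst-case timing analysis.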
File(s)
Name
Dissertation_Ezekiel_Aniebiet_Micheal.pdf
Size
9.53 MB
Format
Adobe PDF
Checksum
(MD5): 02f8e9bf1db5e58bd2b5263eb7b06cd6