Citation Link: https://doi.org/10.25819/ubsi/10740
Distributed Time-Triggered Caching and Memory Access Optimisation for Neural Network Tensor Accelerators in Multicore Safety-Critical Systems
Alternate Title
Verteiltes Time-Triggered Caching und Speicherzugriffsoptimierung für Neural Network Tensor Accelerators in Multicore Safety-Critical Systems
Source Type
Doctoral Thesis
Author
Issue Date
2025
Abstract
Neural network accelerators are essential for meeting the computational demands of modern AI applications; however, their use in safety-critical and real-time environments presents significant challenges, primarily due to inefficiencies in memory access and interference from other applications, which lead to unpredictable memory access patterns. This dissertation addresses these memory access bottlenecks by proposing a time-triggered architecture that enhances the memory access mechanisms of tensor accelerators. Traditional accelerators, such as the Versatile Tensor Accelerator (VTA), encounter limitations related to memory bandwidth, resource contention, and variable latency, which impair performance in safety-critical, memory-intensive tasks. This work introduces the Time-Triggered Memory Access VTA (TTmaVTA), which applies time-triggered architectures to control and regulate the memory access of the VTA, ensuring predictable and conflict-free memory transactions. The TTmaVTA framework is further refined with two memory optimisation techniques: prefetching and caching. These enhancements, collectively referred to as OPTTmaVTA, improve memory throughput while significantly reducing memory access latency. Prefetching retrieves data during idle memory cycles, minimising delays due to dependency stalls, while deterministic caching optimises frequently accessed memory operations, reducing memory bus accesses. Together, these methods improve the memory performance of neural network accelerators while ensuring timing predictability, particularly in safety-critical contexts. This dissertation presents hardware experiments and software simulations that validate the effectiveness of TTmaVTA and OPTTmaVTA in improving memory access predictability and memory throughput.
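The core idea of time-triggered memory access can be sketched as a static slot table: each slot of a repeating bus cycle is pre-assigned to exactly one memory client, so transactions never contend for the shared bus. The following minimal Python sketch is illustrative only; the slot lengths, client names, and table layout are assumptions, not the TTmaVTA implementation.

```python
# Illustrative time-triggered memory schedule (not the TTmaVTA design):
# a fixed, repeating table of bus slots, each owned by one memory client.

CYCLE_SLOTS = 8  # length of one repeating schedule cycle, in bus slots

# Static slot table: because ownership is fixed at design time, the
# outcome of every bus slot is known in advance and conflict-free.
SLOT_TABLE = {
    0: "load",     # fetch input tensor tiles
    1: "load",
    2: "weight",   # fetch weight tiles
    3: "compute",  # traffic for the compute core
    4: "compute",
    5: "store",    # write back output tiles
    6: "store",
    7: "idle",     # reserved slot, usable for prefetching
}

def owner(bus_slot: int) -> str:
    """Return the client allowed to access memory in a given bus slot."""
    return SLOT_TABLE[bus_slot % CYCLE_SLOTS]

def may_access(client: str, bus_slot: int) -> bool:
    """A client may issue a transaction only in its pre-assigned slots."""
    return owner(bus_slot) == client
```

Because the schedule repeats deterministically, worst-case memory latency for any client can be bounded offline, which is the property that makes such schemes attractive for safety-critical certification.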
Hardware-based experiments using a Conv2D workload on an FPGA demonstrate that TTmaVTA achieves a 2.86% reduction in execution time, primarily due to improved memory scheduling and conflict resolution; however, resource overhead limits the scalability of TTmaVTA for larger workloads. Software simulations of OPTTmaVTA with ResNet-18 show a 12.68% improvement in memory access time through prefetching and an 8.75% gain through caching. Overall, the OPTTmaVTA architecture achieves improved memory throughput, with a total latency reduction of approximately 19.86% across all memory operations. Finally, a scheduling algorithm maps memory access patterns to predefined schedules, ensuring deterministic execution and adherence to real-time constraints. Through a combination of theoretical analysis and practical evaluations, this work makes a substantial contribution to hardware-software co-design for neural network accelerators, particularly suited for applications in safety-critical domains.
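The deterministic-caching idea summarised above can be illustrated with a direct-mapped cache, whose hit/miss behaviour for any access sequence is fully predictable. The sketch below is a minimal, hypothetical example; the cache size, tile addressing, and class name are assumptions, not the OPTTmaVTA design.

```python
# Illustrative deterministic (direct-mapped) cache for repeated tile
# reads; not the OPTTmaVTA implementation.

class DirectMappedTileCache:
    """Direct-mapped cache: each address maps to exactly one line, so the
    hit/miss outcome of any access trace is statically analysable."""

    def __init__(self, num_lines: int = 4):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # one stored tag per line
        self.bus_accesses = 0           # counts trips to the memory bus

    def read(self, addr: int) -> bool:
        """Return True on a hit; a miss costs one memory bus access."""
        line, tag = addr % self.num_lines, addr // self.num_lines
        if self.tags[line] == tag:
            return True                 # hit: served without bus traffic
        self.tags[line] = tag           # miss: fill the line from memory
        self.bus_accesses += 1
        return False

# Two passes over the same four tile addresses: the first pass misses
# and fills the cache, the second pass hits entirely, halving the
# number of memory bus accesses for this trace.
cache = DirectMappedTileCache(num_lines=4)
trace = [0, 1, 2, 3] * 2
hits = sum(cache.read(a) for a in trace)
```

Predictability, not just hit rate, is the point: for a known tile-access trace, the exact number of bus accesses can be computed ahead of time and folded into a worst-case timing analysis.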
File(s)
Name
Dissertation_Ezekiel_Aniebiet_Micheal.pdf
Size
9.53 MB
Format
Adobe PDF
Checksum
(MD5): 02f8e9bf1db5e58bd2b5263eb7b06cd6