
Compiler Support for Big Data Processing (PhD Thesis)

By:
Material type: Text
Language: English
Publication details: Karachi: NED University of Engineering and Technology, Department of Computer and Information Systems Engineering, 2021
Description: XII, 250 p. : ill.
Subject(s):
DDC classification:
  • 005.4530378242 HAM
Holdings
Item type            | Current library             | Shelving location        | Call number        | Status    | Date due | Barcode
Reference Collection | Government Document Section | Govt Publication Section | 005.4530378242 HAM | Available |          | 97714
Reference Collection | Government Document Section | Govt Publication Section | 005.4530378242 HAM | Available |          | 97715

Abstract:

Big data refers to huge volumes of real-time, heterogeneous data that cannot be handled by traditional solutions. It is commonly characterized by Volume, Velocity, and Variety (the 3Vs). Until now, the big data problem has been handled through domain-specific languages, databases, third-party libraries, and other application-level solutions, which are adopted without validating the 3V properties. Although these solutions offer greater adaptability and ease of use than lower-level techniques, the benefits come at the cost of reduced overall performance, because they neglect the optimization opportunities available at the system software and hardware levels, resulting in inefficient utilization of the available resources. The compiler layer, in contrast, can exploit hardware resources efficiently through optimized machine code; hence, compiler support for the 3Vs is needed. Numerous compiler optimizations exist that can help meet the processing deadlines of big data workloads; however, the existing techniques are inappropriate for exploring the compiler optimization space of such workloads.

To address these issues, a set of mathematical equations has been proposed to identify the 3Vs. Based on these equations, a framework has been proposed that automatically detects the 3Vs using machine learning at the middle end of the compiler. The tested applications showed high detection accuracy for all 3Vs, and the framework can be adopted in compilers and third-party libraries with minimal overhead and little programmer intervention. Additionally, an engine has been proposed for exploring the compiler optimization space of big data workloads. The engine accelerated the exploration and showed substantial improvements in performance metrics across varying big data workloads; it can also be adopted readily for any architecture, compiler, and application. Finally, the exploration was further accelerated by eliminating identical code executions. The proposed techniques can be employed in healthcare, social media, manufacturing, education, and other sectors to identify the presence of big data workloads and to find suitable compiler optimizations for them.