We a good story
Quick delivery in the UK

Books in the Synthesis Lectures on Computer Architecture series

Filter
Filter
Sort bySort Series order
  • by Christopher J. Nitta
    £32.49

    As the number of cores on a chip continues to climb, architects will need to address both bandwidth and power consumption issues related to the interconnection network. Electrical interconnects are not likely to scale well to a large number of processors for energy efficiency reasons, and the problem is compounded by the fact that there is a fixed total power budget for a die, dictated by the amount of heat that can be dissipated without special (and expensive) cooling and packaging techniques. Thus, there is a need to seek alternatives to electrical signaling for on-chip interconnection applications. Photonics, which has a fundamentally different mechanism of signal propagation, offers the potential to not only overcome the drawbacks of electrical signaling, but also enable the architect to build energy efficient, scalable systems. The purpose of this book is to introduce computer architects to the possibilities and challenges of working with photons and designing on-chip photonic interconnection networks.

  • by Karu Sankaralingam
    £32.49

    In this book we give an overview of modeling techniques used to describe computer systems to mathematical optimization tools. We give a brief introduction to various classes of mathematical optimization frameworks with special focus on mixed integer linear programming which provides a good balance between solver time and expressiveness. We present four detailed case studies -- instruction set customization, data center resource management, spatial architecture scheduling, and resource allocation in tiled architectures -- showing how MILP can be used and quantifying by how much it outperforms traditional design exploration techniques. This book should help a skilled systems designer to learn techniques for using MILP in their problems, and the skilled optimization expert to understand the types of computer systems problems that MILP can be applied to.

  • by Tor M. Aamodt
    £26.99

    Originally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to mining of cryptographic currencies. GPUs can achieve improved performance and efficiency versus central processing units (CPUs) by dedicating a larger fraction of hardware resources to computation. In addition, their general-purpose programmability makes contemporary GPUs appealing to software developers in comparison to domain-specific accelerators. This book provides an introduction to those interested in studying the architecture of GPUs that support general-purpose computing. It collects together information currently only found among a wide range of disparate sources. The authors led development of the GPGPU-Sim simulator widely used in academic research on GPU architectures.The first chapter of this book describes the basic hardware structure of GPUs and provides a brief overview of their history. Chapter 2 provides a summary of GPU programming models relevant to the rest of the book. Chapter 3 explores the architecture of GPU compute cores. Chapter 4 explores the architecture of the GPU memory system. After describing the architecture of existing systems, Chapters 3 and 4 provide an overview of related research. Chapter 5 summarizes cross-cutting research impacting both the compute core and memory system.This book should provide a valuable resource for those wishing to understand the architecture of graphics processor units (GPUs) used for acceleration of general-purpose applications and to those who want to obtain an introduction to the rapidly growing body of research exploring how to improve the architecture of these GPUs.

  • by Abhishek Bhattacharjee
    £50.99

    This book provides computer engineers, academic researchers, new graduate students, and seasoned practitioners an end-to-end overview of virtual memory. We begin with a recap of foundational concepts and discuss not only state-of-the-art virtual memory hardware and software support available today, but also emerging research trends in this space. The span of topics covers processor microarchitecture, memory systems, operating system design, and memory allocation. We show how efficient virtual memory implementations hinge on careful hardware and software cooperation, and we discuss new research directions aimed at addressing emerging problems in this space.Virtual memory is a classic computer science abstraction and one of the pillars of the computing revolution. It has long enabled hardware flexibility, software portability, and overall better security, to name just a few of its powerful benefits. Nearly all user-level programs today take for granted that they will have been freed from the burden of physical memory management by the hardware, the operating system, device drivers, and system libraries.However, despite its ubiquity in systems ranging from warehouse-scale datacenters to embedded Internet of Things (IoT) devices, the overheads of virtual memory are becoming a critical performance bottleneck today. Virtual memory architectures designed for individual CPUs or even individual cores are in many cases struggling to scale up and scale out to today's systems which now increasingly include exotic hardware accelerators (such as GPUs, FPGAs, or DSPs) and emerging memory technologies (such as non-volatile memory), and which run increasingly intensive workloads (such as virtualized and/or "e;big data"e; applications). As such, many of the fundamental abstractions and implementation approaches for virtual memory are being augmented, extended, or entirely rebuilt in order to ensure that virtual memory remains viable and performant in the years to come.

  • by Brandon Reagen
    £50.99

    Machine learning, and specifically deep learning, has been hugely disruptive in many fields of computer science. The success of deep learning techniques in solving notoriously difficult classification and regression problems has resulted in their rapid adoption in solving real-world problems. The emergence of deep learning is widely attributed to a virtuous cycle whereby fundamental advancements in training deeper models were enabled by the availability of massive datasets and high-performance computer hardware.This text serves as a primer for computer architects in a new and rapidly evolving field. We review how machine learning has evolved since its inception in the 1960s and track the key developments leading up to the emergence of the powerful deep learning techniques that emerged in the last decade. Next we review representative workloads, including the most commonly used datasets and seminal networks across a variety of domains. In addition to discussing the workloads themselves, we also detail the most popular deep learning tools and show how aspiring practitioners can use the tools with the workloads to characterize and optimize DNNs.The remainder of the book is dedicated to the design and optimization of hardware and architectures for machine learning. As high-performance hardware was so instrumental in the success of machine learning becoming a practical solution, this chapter recounts a variety of optimizations proposed recently to further improve future designs. Finally, we present a review of recent research published in the area as well as a taxonomy to help readers understand how various contributions fall in context.

  • by Natalie Enright Jerger
    £26.99

    This book targets engineers and researchers familiar with basic computer architecture concepts who are interested in learning about on-chip networks. This work is designed to be a short synthesis of the most critical concepts in on-chip network design. It is a resource for both understanding on-chip network basics and for providing an overview of state of-the-art research in on-chip networks. We believe that an overview that teaches both fundamental concepts and highlights state-of-the-art designs will be of great value to both graduate students and industry engineers. While not an exhaustive text, we hope to illuminate fundamental concepts for the reader as well as identify trends and gaps in on-chip network research. With the rapid advances in this field, we felt it was timely to update and review the state of the art in this second edition. We introduce two new chapters at the end of the book. We have updated the latest research of the past years throughout the book and also expanded our coverage of fundamental concepts to include several research ideas that have now made their way into products and, in our opinion, should be textbook concepts that all on-chip network practitioners should know. For example, these fundamental concepts include message passing, multicast routing, and bubble flow control schemes.

  • by James E. Smith
    £50.99

    Understanding and implementing the brain's computational paradigm is the one true grand challenge facing computer researchers. Not only are the brain's computational capabilities far beyond those of conventional computers, its energy efficiency is truly remarkable. This book, written from the perspective of a computer designer and targeted at computer researchers, is intended to give both background and lay out a course of action for studying the brain's computational paradigm. It contains a mix of concepts and ideas drawn from computational neuroscience, combined with those of the author.As background, relevant biological features are described in terms of their computational and communication properties. The brain's neocortex is constructed of massively interconnected neurons that compute and communicate via voltage spikes, and a strong argument can be made that precise spike timing is an essential element of the paradigm. Drawing from the biological features, a mathematics-based computational paradigm is constructed. The key feature is spiking neurons that perform communication and processing in space-time, with emphasis on time. In these paradigms, time is used as a freely available resource for both communication and computation.Neuron models are first discussed in general, and one is chosen for detailed development. Using the model, single-neuron computation is first explored. Neuron inputs are encoded as spike patterns, and the neuron is trained to identify input pattern similarities. Individual neurons are building blocks for constructing larger ensembles, referred to as "e;columns"e;. These columns are trained in an unsupervised manner and operate collectively to perform the basic cognitive function of pattern clustering. Similar input patterns are mapped to a much smaller set of similar output patterns, thereby dividing the input patterns into identifiable clusters. Larger cognitive systems are formed by combining columns into a hierarchical architecture. These higher level architectures are the subject of ongoing study, and progress to date is described in detail in later chapters. Simulation plays a major role in model development, and the simulation infrastructure developed by the author is described.

  • by Edouard Bugnion
    £29.49

    This book focuses on the core question of the necessary architectural support provided by hardware to efficiently run virtual machines, and of the corresponding design of the hypervisors that run them. Virtualization is still possible when the instruction set architecture lacks such support, but the hypervisor remains more complex and must rely on additional techniques.Despite the focus on architectural support in current architectures, some historical perspective is necessary to appropriately frame the problem. The first half of the book provides the historical perspective of the theoretical framework developed four decades ago by Popek and Goldberg. It also describes earlier systems that enabled virtualization despite the lack of architectural support in hardware.As is often the case, theory defines a necessary-but not sufficient-set of features, and modern architectures are the result of the combination of the theoretical framework with insights derived from practical systems. The second half of the book describes state-of-the-art support for virtualization in both x86-64 and ARM processors. This book includes an in-depth description of the CPU, memory, and I/O virtualization of these two processor architectures, as well as case studies on the Linux/KVM, VMware, and Xen hypervisors. It concludes with a performance comparison of virtualization on current-generation x86- and ARM-based systems across multiple hypervisors.

  • by Benjamin C. Lee
    £34.49

    An era of big data demands datacenters, which house the computing infrastructure that translates raw data into valuable information. This book defines datacenters broadly, as large distributed systems that perform parallel computation for diverse users. These systems exist in multiple forms-private and public-and are built at multiple scales. Datacenter design and management is multifaceted, requiring the simultaneous pursuit of multiple objectives. Performance, efficiency, and fairness are first-order design and management objectives, which can each be viewed from several perspectives. This book surveys datacenter research from a computer architect's perspective, addressing challenges in applications, design, management, server simulation, and system simulation. This perspective complements the rich bodies of work in datacenters as a warehouse-scale system, which study the implications for infrastructure that encloses computing equipment, and in datacenters as distributed systems, which employ abstract details in processor and memory subsystems. This book is written for first- or second-year graduate students in computer architecture and may be helpful for those in computer systems. The goal of this book is to prepare computer architects for datacenter-oriented research by describing prevalent perspectives and the state-of-the-art.

  • by Yakun Sophia Shao
    £29.49

    Hardware acceleration in the form of customized datapath and control circuitry tuned to specific applications has gained popularity for its promise to utilize transistors more efficiently. Historically, the computer architecture community has focused on general-purpose processors, and extensive research infrastructure has been developed to support research efforts in this domain. Envisioning future computing systems with a diverse set of general-purpose cores and accelerators, computer architects must add accelerator-related research infrastructures to their toolboxes to explore future heterogeneous systems. This book serves as a primer for the field, as an overview of the vast literature on accelerator architectures and their design flows, and as a resource guidebook for researchers working in related areas.

  • by Rajesh Bordawekar
    £32.49

    This book aims to achieve the following goals: (1) to provide a high-level survey of key analytics models and algorithms without going into mathematical details; (2) to analyze the usage patterns of these models; and (3) to discuss opportunities for accelerating analytics workloads using software, hardware, and system approaches. The book first describes 14 key analytics models (exemplars) that span data mining, machine learning, and data management domains. For each analytics exemplar, we summarize its computational and runtime patterns and apply the information to evaluate parallelization and acceleration alternatives for that exemplar. Using case studies from important application domains such as deep learning, text analytics, and business intelligence (BI), we demonstrate how various software and hardware acceleration strategies are implemented in practice. This book is intended for both experienced professionals and students who are interested in understanding core algorithms behind analytics workloads. It is designed to serve as a guide for addressing various open problems in accelerating analytics workloads, e.g., new architectural features for supporting analytics workloads, impact on programming models and runtime systems, and designing analytics systems.

  • by Yu-Ting Chen
    £32.49

    Since the end of Dennard scaling in the early 2000s, improving the energy efficiency of computation has been the main concern of the research community and industry. The large energy efficiency gap between general-purpose processors and application-specific integrated circuits (ASICs) motivates the exploration of customizable architectures, where one can adapt the architecture to the workload. In this Synthesis lecture, we present an overview and introduction of the recent developments on energy-efficient customizable architectures, including customizable cores and accelerators, on-chip memory customization, and interconnect optimization. In addition to a discussion of the general techniques and classification of different approaches used in each area, we also highlight and illustrate some of the most successful design examples in each category and discuss their impact on performance and energy efficiency. We hope that this work captures the state-of-the-art research and development on customizable architectures and serves as a useful reference basis for further research, design, and implementation for large-scale deployment in future computing systems.

  • by Yuan Xie
    £39.99

    The emerging three-dimensional (3D) chip architectures, with their intrinsic capability of reducing the wire length, promise attractive solutions to reduce the delay of interconnects in future microprocessors. 3D memory stacking enables much higher memory bandwidth for future chip-multiprocessor design, mitigating the "e;memory wall"e; problem. In addition, heterogenous integration enabled by 3D technology can also result in innovative designs for future microprocessors. This book first provides a brief introduction to this emerging technology, and then presents a variety of approaches to designing future 3D microprocessor systems, by leveraging the benefits of low latency, high bandwidth, and heterogeneous integration capability which are offered by 3D technology.

  • by Christopher J. Hughes
    £45.49

    Having hit power limitations to even more aggressive out-of-order execution in processor cores, many architects in the past decade have turned to single-instruction-multiple-data (SIMD) execution to increase single-threaded performance. SIMD execution, or having a single instruction drive execution of an identical operation on multiple data items, was already well established as a technique to efficiently exploit data parallelism. Furthermore, support for it was already included in many commodity processors. However, in the past decade, SIMD execution has seen a dramatic increase in the set of applications using it, which has motivated big improvements in hardware support in mainstream microprocessors. The easiest way to provide a big performance boost to SIMD hardware is to make it wider-i.e., increase the number of data items hardware operates on simultaneously. Indeed, microprocessor vendors have done this. However, as we exploit more data parallelism in applications, certain challenges can negatively impact performance. In particular, conditional execution, non-contiguous memory accesses, and the presence of some dependences across data items are key roadblocks to achieving peak performance with SIMD execution. This book first describes data parallelism, and why it is so common in popular applications. We then describe SIMD execution, and explain where its performance and energy benefits come from compared to other techniques to exploit parallelism. Finally, we describe SIMD hardware support in current commodity microprocessors. This includes both expected design tradeoffs, as well as unexpected ones, as we work to overcome challenges encountered when trying to map real software to SIMD execution.

  • by Magnus Sjalander
    £32.49

    As Moore's Law and Dennard scaling trends have slowed, the challenges of building high-performance computer architectures while maintaining acceptable power efficiency levels have heightened. Over the past ten years, architecture techniques for power efficiency have shifted from primarily focusing on module-level efficiencies, toward more holistic design styles based on parallelism and heterogeneity. This work highlights and synthesizes recent techniques and trends in power-efficient computer architecture. Table of Contents: Introduction / Voltage and Frequency Management / Heterogeneity and Specialization / Communication and Memory Systems / Conclusions / Bibliography / Authors' Biographies

  • by Hari Angepat
    £29.49

    To date, the most common form of simulators of computer systems are software-based running on standard computers. One promising approach to improve simulation performance is to apply hardware, specifically reconfigurable hardware in the form of field programmable gate arrays (FPGAs). This manuscript describes various approaches of using FPGAs to accelerate software-implemented simulation of computer systems and selected simulators that incorporate those techniques. More precisely, we describe a simulation architecture taxonomy that incorporates a simulation architecture specifically designed for FPGA accelerated simulation, survey the state-of-the-art in FPGA-accelerated simulation, and describe in detail selected instances of the described techniques. Table of Contents: Preface / Acknowledgments / Introduction / Simulator Background / Accelerating Computer System Simulators with FPGAs / Simulation Virtualization / Categorizing FPGA-based Simulators / Conclusion / Bibliography / Authors' Biographies

  • by Ruby B. Lee
    £32.49

    Design for security is an essential aspect of the design of future computers. However, security is not well understood by the computer architecture community. Many important security aspects have evolved over the last several decades in the cryptography, operating systems, and networking communities. This book attempts to introduce the computer architecture student, researcher, or practitioner to the basic concepts of security and threat-based design. Past work in different security communities can inform our thinking and provide a rich set of technologies for building architectural support for security into all future computers and embedded computing devices and appliances. I have tried to keep the book short, which means that many interesting topics and applications could not be included. What the book focuses on are the fundamental security concepts, across different security communities, that should be understood by any computer architect trying to design or evaluate security-aware computer architectures.

  • by Michael L. Scott
    £39.99

    From driving, flying, and swimming, to digging for unknown objects in space exploration, autonomous robots take on varied shapes and sizes. In part, autonomous robots are designed to perform tasks that are too dirty, dull, or dangerous for humans. With nontrivial autonomy and volition, they may soon claim their own place in human society. These robots will be our allies as we strive for understanding our natural and man-made environments and build positive synergies around us. Although we may never perfect replication of biological capabilities in robots, we must harness the inevitable emergence of robots that synchronizes with our own capacities to live, learn, and grow. This book is a snapshot of motivations and methodologies for our collective attempts to transform our lives and enable us to cohabit with robots that work with and for us. It reviews and guides the reader to seminal and continual developments that are the foundations for successful paradigms. It attempts to demystify the abilities and limitations of robots. It is a progress report on the continuing work that will fuel future endeavors. Table of Contents: Part I: Preliminaries/Agency, Motion, and Anatomy/Behaviors / Architectures / Affect/Sensors / Manipulators/Part II: Mobility/Potential Fields/Roadmaps / Reactive Navigation / Multi-Robot Mapping: Brick and Mortar Strategy / Part III: State of the Art / Multi-Robotics Phenomena / Human-Robot Interaction / Fuzzy Control / Decision Theory and Game Theory / Part IV: On the Horizon / Applications: Macro and Micro Robots / References / Author Biography / Discussion

  • by Vijay Janapa Reddi
    £32.49

    Shrinking feature size and diminishing supply voltage are making circuits sensitive to supply voltage fluctuations within the microprocessor, caused by normal workload activity changes. If left unattended, voltage fluctuations can lead to timing violations or even transistor lifetime issues that degrade processor robustness. Mechanisms that learn to tolerate, avoid, and eliminate voltage fluctuations based on program and microarchitectural events can help steer the processor clear of danger, thus enabling tighter voltage margins that improve performance or lower power consumption. We describe the problem of voltage variation and the factors that influence this variation during processor design and operation. We also describe a variety of runtime hardware and software mitigation techniques that either tolerate, avoid, and/or eliminate voltage violations. We hope processor architects will find the information useful since tolerance, avoidance, and elimination are generalizable constructs that can serve as a basis for addressing other reliability challenges as well. Table of Contents: Introduction / Modeling Voltage Variation / Understanding the Characteristics of Voltage Variation / Traditional Solutions and Emerging Solution Forecast / Allowing and Tolerating Voltage Emergencies / Predicting and Avoiding Voltage Emergencies / Eliminiating Recurring Voltage Emergencies / Future Directions on Resiliency

  • by Mario Nemirovsky
    £32.49

    Multithreaded architectures now appear across the entire range of computing devices, from the highest-performing general purpose devices to low-end embedded processors. Multithreading enables a processor core to more effectively utilize its computational resources, as a stall in one thread need not cause execution resources to be idle. This enables the computer architect to maximize performance within area constraints, power constraints, or energy constraints. However, the architectural options for the processor designer or architect looking to implement multithreading are quite extensive and varied, as evidenced not only by the research literature but also by the variety of commercial implementations. This book introduces the basic concepts of multithreading, describes a number of models of multithreading, and then develops the three classic models (coarse-grain, fine-grain, and simultaneous multithreading) in greater detail. It describes a wide variety of architectural and software design tradeoffs, as well as opportunities specific to multithreading architectures. Finally, it details a number of important commercial and academic hardware implementations of multithreading. Table of Contents: Introduction / Multithreaded Execution Models / Coarse-Grain Multithreading / Fine-Grain Multithreading / Simultaneous Multithreading / Managing Contention / New Opportunities for Multithreaded Processors / Experimentation and Metrics / Implementations of Multithreaded Processors / Conclusion

  • by Hyesoon Kim
    £29.49

    General-purpose graphics processing units (GPGPU) have emerged as an important class of shared memory parallel processing architectures, with widespread deployment in every computer class from high-end supercomputers to embedded mobile platforms. Relative to more traditional multicore systems of today, GPGPUs have distinctly higher degrees of hardware multithreading (hundreds of hardware thread contexts vs. tens), a return to wide vector units (several tens vs. 1-10), memory architectures that deliver higher peak memory bandwidth (hundreds of gigabytes per second vs. tens), and smaller caches/scratchpad memories (less than 1 megabyte vs. 1-10 megabytes). In this book, we provide a high-level overview of current GPGPU architectures and programming models. We review the principles that are used in previous shared memory parallel platforms, focusing on recent results in both the theory and practice of parallel algorithms, and suggest a connection to GPGPU platforms. We aim to provide hints to architects about understanding algorithm aspect to GPGPU. We also provide detailed performance analysis and guide optimizations from high-level algorithms to low-level instruction level optimizations. As a case study, we use n-body particle simulations known as the fast multipole method (FMM) as an example. We also briefly survey the state-of-the-art in GPU performance analysis tools and techniques. Table of Contents: GPU Design, Programming, and Trends / Performance Principles / From Principles to Practice: Analysis and Tuning / Using Detailed Performance Analysis to Guide Optimization

  • by Samuel Midkiff
    £34.49

    Compiling for parallelism is a longstanding topic of compiler research. This book describes the fundamental principles of compiling "e;regular"e; numerical programs for parallelism. We begin with an explanation of analyses that allow a compiler to understand the interaction of data reads and writes in different statements and loop iterations during program execution. These analyses include dependence analysis, use-def analysis and pointer analysis. Next, we describe how the results of these analyses are used to enable transformations that make loops more amenable to parallelization, and discuss transformations that expose parallelism to target shared memory multicore and vector processors. We then discuss some problems that arise when parallelizing programs for execution on distributed memory machines. Finally, we conclude with an overview of solving Diophantine equations and suggestions for further readings in the topics of this book to enable the interested reader to delve deeper into the field. Table of Contents: Introduction and overview / Dependence analysis, dependence graphs and alias analysis / Program parallelization / Transformations to modify and eliminate dependences / Transformation of iterative and recursive constructs / Compiling for distributed memory machines / Solving Diophantine equations / A guide to further reading

  • by Naveen Muralimanohar
    £32.49

    As conventional memory technologies such as DRAM and Flash run into scaling challenges, architects and system designers are forced to look at alternative technologies for building future computer systems. This synthesis lecture begins by listing the requirements for a next generation memory technology and briefly surveys the landscape of novel non-volatile memories. Among these, Phase Change Memory (PCM) is emerging as a leading contender, and the authors discuss the material, device, and circuit advances underlying this exciting technology. The lecture then describes architectural solutions to enable PCM for main memories. Finally, the authors explore the impact of such byte-addressable non-volatile memories on future storage and system designs. Table of Contents: Next Generation Memory Technologies / Architecting PCM for Main Memories / Tolerating Slow Writes in PCM / Wear Leveling for Durability / Wear Leveling Under Adversarial Settings / Error Resilience in Phase Change Memories / Storage and System Design With Emerging Non-Volatile Memories

  • by Rajeev Balasubramonian
    £34.49

    A key determinant of overall system performance and power dissipation is the cache hierarchy since access to off-chip memory consumes many more cycles and energy than on-chip accesses. In addition, multi-core processors are expected to place ever higher bandwidth demands on the memory system. All these issues make it important to avoid off-chip memory access by improving the efficiency of the on-chip cache. Future multi-core processors will have many large cache banks connected by a network and shared by many cores. Hence, many important problems must be solved: cache resources must be allocated across many cores, data must be placed in cache banks that are near the accessing core, and the most important data must be identified for retention. Finally, difficulties in scaling existing technologies require adapting to and exploiting new technology constraints. The book attempts a synthesis of recent cache research that has focused on innovations for multi-core processors. It is an excellent starting point for early-stage graduate students, researchers, and practitioners who wish to understand the landscape of recent cache research. The book is suitable as a reference for advanced computer architecture classes as well as for experienced researchers and VLSI engineers. Table of Contents: Basic Elements of Large Cache Design / Organizing Data in CMP Last Level Caches / Policies Impacting Cache Hit Rates / Interconnection Networks within Large Caches / Technology / Concluding Remarks

  • by Kim Hazelwood
    £26.99

    Dynamic binary modification tools form a software layer between a running application and the underlying operating system, providing the powerful opportunity to inspect and potentially modify every user-level guest application instruction that executes. Toolkits built upon this technology have enabled computer architects to build powerful simulators and emulators for design-space exploration, compiler writers to analyze and debug the code generated by their compilers, software developers to fully explore the features, bottlenecks, and performance of their software, and even end-users to extend the functionality of proprietary software running on their computers. Several dynamic binary modification systems are freely available today that place this power into the hands of the end user. While these systems are quite complex internally, they mask that complexity with an easy-to-learn API that allows a typical user to ramp up fairly quickly and build any of a number of powerful tools. Meanwhile, these tools are robust enough to form the foundation for software products in use today. This book serves as a primer for researchers interested in dynamic binary modification systems, their internal design structure, and the wide range of tools that can be built leveraging these systems. The hands-on examples presented throughout form a solid foundation for designing and constructing more complex tools, with an appreciation for the techniques necessary to make those tools robust and efficient. Meanwhile, the reader will get an appreciation for the internal design of the engines themselves. Table of Contents: Dynamic Binary Modification: Overview / Using a Dynamic Binary Modifier / Program Analysis and Debugging / Active Program Modification / Architectural Exploration / Advanced System Internals / Historical Perspectives / Summary and Observations

  • by Tzvetan Metodi
    £39.99

    Quantum computers can (in theory) solve certain problems far faster than a classical computer running any known classical algorithm. While existing technologies for building quantum computers are in their infancy, it is not too early to consider their scalability and reliability in the context of the design of large-scale quantum computers. To architect such systems, one must understand what it takes to design and model a balanced, fault-tolerant quantum computer architecture. The goal of this lecture is to provide architectural abstractions for the design of a quantum computer and to explore the systems-level challenges in achieving scalable, fault-tolerant quantum computation. In this lecture, we provide an engineering-oriented introduction to quantum computation with an overview of the theory behind key quantum algorithms. Next, we look at architectural case studies based upon experimental data and future projections for quantum computation implemented using trapped ions. While we focus here on architectures targeted for realization using trapped ions, the techniques for quantum computer architecture design, quantum fault-tolerance, and compilation described in this lecture are applicable to many other physical technologies that may be viable candidates for building a large-scale quantum computing system. We also discuss general issues involved with programming a quantum computer as well as a discussion of work on quantum architectures based on quantum teleportation. Finally, we consider some of the open issues remaining in the design of quantum computers. Table of Contents: Introduction / Basic Elements for Quantum Computation / Key Quantum Algorithms / Building Reliable and Scalable Quantum Architectures / Simulation of Quantum Computation / Architectural Elements / Case Study: The Quantum Logic Array Architecture / Programming the Quantum Architecture / Using the QLA for Quantum Simulation: The Transverse Ising Model / Teleportation-Based Quantum Architectures / Concluding Remarks

  • by Dennis Abts
    £26.99

    Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applications to communicate and interoperate in an orchestrated and efficient way. This book describes the design and engineering tradeoffs of datacenter networks. It describes interconnection networks from topology and network architecture to routing algorithms, and presents opportunities for taking advantage of the emerging technology trends that are influencing router microarchitecture. With the emergence of "e;many-core"e; processor chips, it is evident that we will also need "e;many-port"e; routing chips to provide a bandwidth-rich network to avoid the performance limiting effects of Amdahl's Law. We provide an overview of conventional topologies and their routing algorithms and show how technology, signaling rates and cost-effective optics are motivating new network topologies that scale up to millions of hosts. The book also provides detailed case studies of two high performance parallel computer systems and their networks. Table of Contents: Introduction / Background / Topology Basics / High-Radix Topologies / Routing / Scalable Switch Microarchitecture / System Packaging / Case Studies / Closing Remarks

  • by Antonio Gonzalez
    £32.49

    This lecture presents a study of the microarchitecture of contemporary microprocessors. The focus is on implementation aspects, with discussions on their implications in terms of performance, power, and cost of state-of-the-art designs. The lecture starts with an overview of the different types of microprocessors and a review of the microarchitecture of cache memories. Then, it describes the implementation of the fetch unit, where special emphasis is made on the required support for branch prediction. The next section is devoted to instruction decode with special focus on the particular support to decoding x86 instructions. The next chapter presents the allocation stage and pays special attention to the implementation of register renaming. Afterward, the issue stage is studied. Here, the logic to implement out-of-order issue for both memory and non-memory instructions is thoroughly described. The following chapter focuses on the instruction execution and describes the different functional units that can be found in contemporary microprocessors, as well as the implementation of the bypass network, which has an important impact on the performance. Finally, the lecture concludes with the commit stage, where it describes how the architectural state is updated and recovered in case of exceptions or misspeculations. This lecture is intended for an advanced course on computer architecture, suitable for graduate students or senior undergrads who want to specialize in the area of computer architecture. It is also intended for practitioners in the industry in the area of microprocessor design. The book assumes that the reader is familiar with the main concepts regarding pipelining, out-of-order execution, cache memories, and virtual memory. Table of Contents: Introduction / Caches / The Instruction Fetch Unit / Decode / Allocation / The Issue Stage / Execute / The Commit Stage / References / Author Biographies

  • by Tim Harris
    £39.99

    The advent of multicore processors has renewed interest in the idea of incorporating transactions into the programming model used to write parallel programs. This approach, known as transactional memory, offers an alternative, and hopefully better, way to coordinate concurrent threads. The ACI (atomicity, consistency, isolation) properties of transactions provide a foundation to ensure that concurrent reads and writes of shared data do not produce inconsistent or incorrect results. At a higher level, a computation wrapped in a transaction executes atomically - either it completes successfully and commits its result in its entirety or it aborts. In addition, isolation ensures the transaction produces the same result as if no other transactions were executing concurrently. Although transactions are not a parallel programming panacea, they shift much of the burden of synchronizing and coordinating parallel computations from a programmer to a compiler, to a language runtime system, or to hardware. The challenge for the system implementers is to build an efficient transactional memory infrastructure. This book presents an overview of the state of the art in the design and implementation of transactional memory systems, as of early spring 2010. Table of Contents: Introduction / Basic Transactions / Building on Basic Transactions / Software Transactional Memory / Hardware-Supported Transactional Memory / Conclusions

  • by Lieven Eeckhout
    £29.49

    Performance evaluation is at the foundation of computer architecture research and development. Contemporary microprocessors are so complex that architects cannot design systems based on intuition and simple models only. Adequate performance evaluation methods are absolutely crucial to steer the research and development process in the right direction. However, rigorous performance evaluation is non-trivial as there are multiple aspects to performance evaluation, such as picking workloads, selecting an appropriate modeling or simulation approach, running the model and interpreting the results using meaningful metrics. Each of these aspects is equally important and a performance evaluation method that lacks rigor in any of these crucial aspects may lead to inaccurate performance data and may drive research and development in a wrong direction. The goal of this book is to present an overview of the current state-of-the-art in computer architecture performance evaluation, with a special emphasis on methods for exploring processor architectures. The book focuses on fundamental concepts and ideas for obtaining accurate performance data. The book covers various topics in performance evaluation, ranging from performance metrics, to workload selection, to various modeling approaches including mechanistic and empirical modeling. And because simulation is by far the most prevalent modeling technique, more than half the book's content is devoted to simulation. The book provides an overview of the simulation techniques in the computer designer's toolbox, followed by various simulation acceleration techniques including sampled simulation, statistical simulation, parallel simulation and hardware-accelerated simulation. Table of Contents: Introduction / Performance Metrics / Workload Design / Analytical Performance Modeling / Simulation / Sampled Simulation / Statistical Simulation / Parallel Simulation and Hardware Acceleration / Concluding Remarks

Join thousands of book lovers

Sign up to our newsletter and receive discounts and inspiration for your next reading experience.