The Resource Conquering big data with high performance computing, Ritu Arora, editor

Label
Conquering big data with high performance computing
Title
Conquering big data with high performance computing
Statement of responsibility
Ritu Arora, editor
Contributor
Editor
Subject
Genre
Language
eng
Summary
This book provides an overview of the resources and research projects that are bringing Big Data and High Performance Computing (HPC) onto converging tracks. It demystifies Big Data and HPC for the reader by covering the primary resources, middleware, applications, and tools that enable the usage of HPC platforms for Big Data management and processing. Through interesting use-cases from traditional and non-traditional HPC domains, the book highlights the most critical challenges related to Big Data processing and management, and shows ways to mitigate them using HPC resources. Unlike most books on Big Data, it covers a variety of alternatives to Hadoop, and explains the differences between HPC platforms and Hadoop. Written by professionals and researchers in a range of departments and fields, this book is designed for anyone studying Big Data and its future directions. Those studying HPC will also find the content valuable.
Cataloging source
YDX
Dewey number
004.11
Index
no index present
LC call number
QA76.88
Literary form
non fiction
Nature of contents
  • dictionaries
  • bibliography
http://library.link/vocab/relatedWorkOrContributorName
Arora, Ritu
http://library.link/vocab/subjectName
  • High performance computing
  • Big data
  • COMPUTERS
  • Systems analysis & design
  • Algorithms & data structures
  • Databases
Label
Conquering big data with high performance computing, Ritu Arora, editor
Link
https://ezproxy.lib.ou.edu/login?url=http://link.springer.com/10.1007/978-3-319-33742-5
Instantiates
Publication
Copyright
Bibliography note
Includes bibliographical references
Carrier category
online resource
Carrier category code
cr
Carrier MARC source
rdacarrier
Content category
text
Content type code
txt
Content type MARC source
rdacontent
Contents
  • Preface; Contents; 1 An Introduction to Big Data, High Performance Computing, High-Throughput Computing, and Hadoop; 1.1 Big Data; 1.2 High Performance Computing (HPC); 1.2.1 HPC Platform; 1.2.2 Serial and Parallel Processing on HPC Platform; 1.3 High-Throughput Computing (HTC); 1.4 Hadoop; 1.4.1 Hadoop-Related Technologies; 1.4.2 Some Limitations of Hadoop and Hadoop-Related Technologies; 1.5 Convergence of Big Data, HPC, HTC, and Hadoop; 1.6 HPC and Big Data Processing in Cloud and at Open-Science Data Centers; 1.7 Conclusion; References
  • 2 Using High Performance Computing for Conquering Big Data; 2.1 Introduction; 2.2 The Big Data Life Cycle; 2.3 Technologies and Hardware Platforms for Managing the Big Data Life Cycle; 2.4 Managing Big Data Life Cycle on HPC Platforms at Open-Science Data Centers; 2.4.1 TACC Resources and Usage Policies; 2.4.2 End-to-End Big Data Life Cycle on TACC Resources; 2.5 Use Case: Optimization of Nuclear Fusion Devices; 2.5.1 Optimization; 2.5.2 Computation on HPC; 2.5.3 Visualization Using GPUs; 2.5.4 Permanent Storage of Valuable Data; 2.6 Conclusions; References
  • 3 Data Movement in Data-Intensive High Performance Computing; 3.1 Introduction; 3.2 Node-Level Data Movement; 3.2.1 Case Study: ADAMANT; 3.2.2 Case Study: Energy Cost of Data Movement; 3.3 System-Level Data Movement; 3.3.1 Case Study: Graphs; 3.3.2 Case Study: Map Reduce; 3.4 Center-Level Data Movement; 3.4.1 Case Study: Spider; 3.4.2 Case Study: Gordon and Oasis; 3.5 About the Authors; References; 4 Using Managed High Performance Computing Systems for High-Throughput Computing; 4.1 Introduction; 4.2 What Are We Trying to Do?; 4.2.1 Deductive Computation; 4.2.2 Inductive Computation
  • 4.2.2.1 High-Throughput Computing; 4.3 Hurdles to Using HPC Systems for HTC; 4.3.1 Runtime Limits; 4.3.2 Jobs-in-Queue Limits; 4.3.3 Dynamic Job Submission Restrictions; 4.3.4 Solutions from Resource Managers and Big Data Research; 4.3.5 A Better Solution for Managed HPC Systems; 4.4 Launcher; 4.4.1 How Launcher Works; 4.4.2 Guided Example: A Simple Launcher Bundle; 4.4.2.1 Step 1: Create a Job File; 4.4.2.2 Step 2: Build a SLURM Batch Script; 4.4.3 Using Various Scheduling Methods; 4.4.3.1 Dynamic Scheduling; 4.4.3.2 Static Scheduling; 4.4.4 Launcher with Intel® Xeon Phi Coprocessors
  • 4.4.4.1 Offload; 4.4.4.2 Independent Workloads for Host and Coprocessor; 4.4.4.3 Symmetric Execution on Host and Phi; 4.4.5 Use Case: Molecular Docking and Virtual Screening; 4.5 Conclusion; References; 5 Accelerating Big Data Processing on Modern HPC Clusters; 5.1 Introduction; 5.2 Overview of Apache Hadoop and Spark; 5.2.1 Overview of Apache Hadoop Distributed File System; 5.2.2 Overview of Apache Hadoop MapReduce; 5.2.3 Overview of Apache Spark; 5.3 Overview of High-Performance Interconnects and Storage Architecture on Modern HPC Clusters
Dimensions
unknown
Extent
1 online resource
Form of item
online
Isbn
9783319337425
Media category
computer
Media MARC source
rdamedia
Media type code
c
Note
SpringerLink
Specific material designation
remote
System control number
  • (OCoLC)958864781
  • (OCoLC)ocn958864781

Library Locations

  • Architecture Library
    Gould Hall 830 Van Vleet Oval Rm. 105, Norman, OK, 73019, US
    35.205706 -97.445050
  • Bizzell Memorial Library
    401 W. Brooks St., Norman, OK, 73019, US
    35.207487 -97.447906
  • Boorstin Collection
    401 W. Brooks St., Norman, OK, 73019, US
    35.207487 -97.447906
  • Chinese Literature Translation Archive
    401 W. Brooks St., Rm. 414, Norman, OK, 73019, US
    35.207487 -97.447906
  • Engineering Library
    Felgar Hall 865 Asp Avenue, Rm. 222, Norman, OK, 73019, US
    35.205706 -97.445050
  • Fine Arts Library
    Catlett Music Center 500 West Boyd Street, Rm. 20, Norman, OK, 73019, US
    35.210371 -97.448244
  • Harry W. Bass Business History Collection
    401 W. Brooks St., Rm. 521NW, Norman, OK, 73019, US
    35.207487 -97.447906
  • History of Science Collections
    401 W. Brooks St., Rm. 521NW, Norman, OK, 73019, US
    35.207487 -97.447906
  • John and Mary Nichols Rare Books and Special Collections
    401 W. Brooks St., Rm. 509NW, Norman, OK, 73019, US
    35.207487 -97.447906
  • Library Service Center
    2601 Technology Place, Norman, OK, 73019, US
    35.185561 -97.398361
  • Price College Digital Library
    Adams Hall 102 307 West Brooks St., Norman, OK, 73019, US
    35.210371 -97.448244
  • Western History Collections
    Monnet Hall 630 Parrington Oval, Rm. 300, Norman, OK, 73019, US
    35.209584 -97.445414