Apache Flume: distributed log collection for Hadoop : design and implement a series of Flume agents to send streamed data into Hadoop, Steve Hoffman

Label
Apache Flume: distributed log collection for Hadoop : design and implement a series of Flume agents to send streamed data into Hadoop
Title
Apache Flume: distributed log collection for Hadoop
Title remainder
design and implement a series of Flume agents to send streamed data into Hadoop
Statement of responsibility
Steve Hoffman
Title variation
Design and implement a series of Flume agents to send streamed data into Hadoop
Creator
Author
Subject
Genre
Language
eng
Summary
If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop File System (HDFS) is assumed.
Member of
Cataloging source
UMI
http://library.link/vocab/creatorName
Hoffman, Steve
Dewey number
004.36
Illustrations
illustrations
Index
index present
LC call number
QA76.9.D5
Literary form
non fiction
Nature of contents
dictionaries
Series statement
Community experience distilled
http://library.link/vocab/subjectName
  • Electronic data processing
  • File organization (Computer science)
Label
Apache Flume: distributed log collection for Hadoop : design and implement a series of Flume agents to send streamed data into Hadoop, Steve Hoffman
Link
http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=959552
Instantiates
Publication
Note
Includes index
Carrier category
online resource
Carrier category code
  • cr
Carrier MARC source
rdacarrier
Content category
text
Content type code
  • txt
Content type MARC source
rdacontent
Contents
  • Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Overview and Architecture; Flume 0.9; Flume 1.X (Flume-NG); The problem with HDFS and streaming data/logs; Sources, channels, and sinks; Flume events; Interceptors, channel selectors, and sink processors; Tiered data collection (multiple flows and/or agents); The Kite SDK; Summary; Chapter 2: A Quick Start Guide to Flume; Downloading Flume; Flume in Hadoop distributions; An overview of the Flume configuration file; Starting up with "Hello, World!"; Summary
  • Chapter 3: Channels; The memory channel; The file channel; Spillable Memory Channel; Summary; Chapter 4: Sinks and Sink Processors; HDFS sink; Path and filename; File rotation; Compression codecs; Event Serializers; Text output; Text with headers; Apache Avro; User-provided Avro schema; File type; SequenceFile; DataStream; CompressedStream; Timeouts and workers; Sink groups; Load balancing; Failover; MorphlineSolrSink; Morphline configuration files; Typical SolrSink configuration; Sink configuration; ElasticSearchSink; LogStash Serializer; Dynamic Serializer; Summary
  • Chapter 5: Sources and Channel Selectors; The problem with using tail; The Exec source; Spooling Directory Source; Syslog sources; The syslog UDP source; The syslog TCP source; The multiport syslog TCP source; JMS source; Channel selectors; Replicating; Multiplexing; Summary; Chapter 6: Interceptors, ETL, and Routing; Interceptors; Timestamp; Host; Static; Regular expression filtering; Regular expression extractor; Morphline interceptor; Custom interceptors; The plugins directory; Tiering flows; The Avro source/sink; Compressing Avro; SSL Avro flows; The Thrift source/sink
  • Using command-line Avro; The Log4J appender; The Log4J load-balancing appender; The embedded agent; Configuration and startup; Sending data; Shutdown; Routing; Summary; Chapter 7: Putting it All Together; Web logs to searchable UI; Setting up the web server; Configuring log rotation to the spool directory; Setting up the target -- Elasticsearch; Setting up Flume on collector/relay; Setting up Flume on the client; Creating more search fields with an interceptor; Setting up a better user interface -- Kibana; Archiving to HDFS; Summary; Chapter 8: Monitoring Flume; Monitoring the agent process
  • Monit; Nagios; Monitoring performance metrics; Ganglia; Internal HTTP server; Custom monitoring hooks; Summary; Chapter 9: There Is No Spoon -- the Realities of Real-time Distributed Data Collection; Transport time versus log time; Time zones are evil; Capacity planning; Considerations for multiple data centers; Compliance and data expiry; Summary; Index
Dimensions
unknown
Edition
Second edition.
Extent
1 online resource (1 volume)
Form of item
online
Isbn
9781784399146
Media category
computer
Media MARC source
rdamedia
Media type code
  • c
Note
eBooks on EBSCOhost
Other physical details
illustrations
Sound
unknown sound
Specific material designation
remote
System control number
  • (OCoLC)906041062
  • (OCoLC)ocn906041062

Library Locations

  • Architecture Library
    Gould Hall 830 Van Vleet Oval Rm. 105, Norman, OK, 73019, US
    35.205706 -97.445050
  • Bizzell Memorial Library
    401 W. Brooks St., Norman, OK, 73019, US
    35.207487 -97.447906
  • Boorstin Collection
    401 W. Brooks St., Norman, OK, 73019, US
    35.207487 -97.447906
  • Chinese Literature Translation Archive
    401 W. Brooks St., RM 414, Norman, OK, 73019, US
    35.207487 -97.447906
  • Engineering Library
    Felgar Hall 865 Asp Avenue, Rm. 222, Norman, OK, 73019, US
    35.205706 -97.445050
  • Fine Arts Library
    Catlett Music Center 500 West Boyd Street, Rm. 20, Norman, OK, 73019, US
    35.210371 -97.448244
  • Harry W. Bass Business History Collection
    401 W. Brooks St., Rm. 521NW, Norman, OK, 73019, US
    35.207487 -97.447906
  • History of Science Collections
    401 W. Brooks St., Rm. 521NW, Norman, OK, 73019, US
    35.207487 -97.447906
  • John and Mary Nichols Rare Books and Special Collections
    401 W. Brooks St., Rm. 509NW, Norman, OK, 73019, US
    35.207487 -97.447906
  • Library Service Center
    2601 Technology Place, Norman, OK, 73019, US
    35.185561 -97.398361
  • Price College Digital Library
    Adams Hall 102 307 West Brooks St., Norman, OK, 73019, US
    35.210371 -97.448244
  • Western History Collections
    Monnet Hall 630 Parrington Oval, Rm. 300, Norman, OK, 73019, US
    35.209584 -97.445414