Learning Apache Drill - Query and Analyze Distributed Data Sources with SQL (Paperback)

,
Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you’ll learn how Drill helps you analyze data more effectively to drive down time to insight. Use Drill to clean, prepare, and summarize delimited data for further analysis Query file types including logfiles, Parquet, JSON, and other complex formats Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL Connect to Drill programmatically using a variety of languages Use Drill even with challenging or ambiguous file formats Perform sophisticated analysis by extending Drill’s functionality with user-defined functions Facilitate data analysis for network security, image metadata, and machine learning

R1,031
List Price R1,455
Save R424 29%

Or split into 4x interest-free payments of 25% on orders over R50
Learn more

Discovery Miles10310
Mobicred@R97pm x 12* Mobicred Info
Free Delivery
Delivery AdviceShips in 12 - 17 working days


Toggle WishListAdd to wish list
Review this Item

Product Description

Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you’ll learn how Drill helps you analyze data more effectively to drive down time to insight. Use Drill to clean, prepare, and summarize delimited data for further analysis Query file types including logfiles, Parquet, JSON, and other complex formats Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL Connect to Drill programmatically using a variety of languages Use Drill even with challenging or ambiguous file formats Perform sophisticated analysis by extending Drill’s functionality with user-defined functions Facilitate data analysis for network security, image metadata, and machine learning

Customer Reviews

No reviews or ratings yet - be the first to create one!

Product Details

General

Imprint

O'Reilly Media

Country of origin

United States

Release date

November 2018

Availability

Expected to ship within 12 - 17 working days

Authors

,

Dimensions

233 x 178 x 18mm (L x W x T)

Format

Paperback

Pages

334

ISBN-13

978-1-4920-3279-3

Barcode

9781492032793

Categories

LSN

1-4920-3279-4



Trending On Loot