50 hours of big data, PySpark, AWS, Scala, and Scraping.
(eVideo)

Published
[Place of publication not identified] : Packt Publishing, [2022].
Edition
[First edition].
ISBN
9781803237039, 1803237031
Physical Desc
1 online resource (1 video file (54 hr., 36 min.)) : sound, color.

More Details

Format
eVideo
Language
English

Notes

General Note
"Updated in March 2022."
General Note
"AI Sciences."
Participants/Performers
Muhammad Ahmad, instructor.
Description
Learn, build, and execute big data strategies with Scala and Spark, PySpark and AWS, data scraping and data mining with Python, and master MongoDB.

About This Video
Data scraping and data mining for beginners to pro with Python. Clear unfolding of concepts with examples in Python, Scrapy, Scala, PySpark, and MongoDB. Master big data with PySpark and AWS.

In Detail
Part 1 is designed to reflect the most in-demand Scala skills and provides an in-depth understanding of core Scala concepts. It wraps up with a discussion of MapReduce and ETL pipelines using Spark, from AWS S3 to AWS RDS (including six mini-projects and one Scala Spark project).

Part 2 covers PySpark for data analysis. You will explore Spark RDDs and DataFrames, some Spark SQL queries, the transformations and actions that can be performed on data using RDDs and DataFrames, and the Spark and Hadoop ecosystem with its underlying architecture. You will also learn how to leverage AWS storage, databases, and computation, and how Spark communicates with different AWS services.

Part 3 is all about data scraping and data mining. It covers important concepts such as how the browser executes requests and communicates with the server, synchronous versus asynchronous communication, parsing the data in the server's response, tools for data scraping, the Python requests module, and more.

In Part 4, you will use MongoDB to develop an understanding of NoSQL databases. You will explore the basic operations along with the MongoDB query, projection, and update operators. The section winds up with two projects: a CRUD-based application built with Django and MongoDB, and an ETL pipeline implemented in PySpark that dumps the data into MongoDB.

By the end of this course, you will be able to relate the concepts and practical aspects of these technologies to real-world problems.
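The MapReduce pattern that Part 1 builds toward can be sketched in a few lines of plain Python. This is a toy word count, not material from the course itself (which implements the pattern with Scala and Spark); the sample lines are hypothetical:

```python
from collections import Counter
from functools import reduce

# Hypothetical input data standing in for lines of a text file.
lines = [
    "spark makes big data simple",
    "big data needs big tools",
]

# Map step: turn each line into a Counter of word frequencies.
mapped = [Counter(line.split()) for line in lines]

# Reduce step: merge the per-line counts into one total.
totals = reduce(lambda a, b: a + b, mapped, Counter())

print(totals["big"])  # "big" appears three times across both lines
```

In Spark the same shape appears as a `map` transformation followed by a `reduceByKey` action, distributed across a cluster instead of a single list.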
Audience
This course is designed for absolute beginners who want to create intelligent solutions, work with real data, and enjoy learning theory and then putting it into practice. Data scientists, machine learning experts, and drop shippers will all benefit from this training. A basic understanding of programming, HTML tags, Python, SQL, and Node.js is required; however, no prior knowledge of data scraping or Scala is needed.
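As a taste of the response-parsing step described in Part 3, here is a minimal sketch using only Python's standard-library html.parser. The HTML string is a hypothetical stand-in for a server response body; the course itself uses the requests module and dedicated scraping tools such as Scrapy:

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Collects the text inside every <h2> tag of an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())

# Hypothetical response body in place of a live HTTP request.
response_body = "<html><body><h2>Scala</h2><h2>PySpark</h2></body></html>"
parser = TitleExtractor()
parser.feed(response_body)
print(parser.titles)  # ['Scala', 'PySpark']
```

In a real scraper the `response_body` string would come from something like `requests.get(url).text`, with the parsing step unchanged.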
Local note
O'Reilly, O'Reilly Online Learning Platform: Academic Edition (EZproxy Access)


Citations

APA Citation, 7th Edition (style guide)

Ahmad, M. (2022). 50 hours of big data, PySpark, AWS, Scala, and Scraping ([First edition].). Packt Publishing.

Chicago / Turabian - Author Date Citation, 17th Edition (style guide)

Ahmad, Muhammad, 1982-. 2022. 50 Hours of Big Data, PySpark, AWS, Scala, and Scraping. Packt Publishing.

Chicago / Turabian - Humanities (Notes and Bibliography) Citation, 17th Edition (style guide)

Ahmad, Muhammad, 1982-. 50 Hours of Big Data, PySpark, AWS, Scala, and Scraping. Packt Publishing, 2022.

MLA Citation, 9th Edition (style guide)

Ahmad, Muhammad. 50 Hours of Big Data, PySpark, AWS, Scala, and Scraping [First edition]., Packt Publishing, 2022.

Note! Citations contain only title, author, edition, publisher, and year published. Citations should be used as a guideline and should be double checked for accuracy. Citation formats are based on standards as of August 2021.