Introduction to Big Data

Keywords

information technology, IT, big data, data science, data analysis, Tableau, R and R Studio, Bunker Hill Community College, NRC, Northeast Resiliency Consortium

Project

Northeast Resiliency Consortium

Authors

Date Submitted

2016

Material Type

Syllabus

Secondary Material Type

Institution

Bunker Hill Community College

Industry Partner

N/A

License

CC BY

Funding Source

TAACCCT Round 3

Additional Public Access

Abstract

This is the course syllabus for Bunker Hill Community College’s Introduction to Big Data With R and R Studio course. With the advent of smartphones, recent increases in processor speed, and decreased data storage costs, data has moved to the forefront of computing in the business world. Data handling was once thought of as a secondary responsibility in business, but the availability, cost, and vast amount of data have changed the paradigm of business intelligence and data handling, a field now known as Big Data. A McKinsey & Company (2011) report stated that the number of jobs in Big Data will grow by 190,000 for junior-level positions and by over 1.5 million for managerial positions. There is currently a shortfall of Big Data employees, and the gap will continue to widen as data grows and becomes more available. This course is designed to help bridge that gap. Intro to Big Data with R and R-Studio introduces students to the field of Big Data, its concepts and technologies, as well as R and R-Studio. The course will explore the question “What is Big Data?” Students will explore the role of a data scientist and the three “V’s” of Big Data (velocity, variety, and veracity). They will also learn various visualization techniques and best communication practices used in the industry. Students will learn to differentiate between raw data, clean data, and tidy data, and to use tools to convert data to and from these formats. Students will learn how to manage and analyze data effectively and efficiently in a computing environment. Students will learn basic statistical tests for analyzing large data sets using the R programming language and software. Students will also learn how to use the data visualization software Tableau. You do not have to purchase a book for this course; instead, the course uses open source textbooks, along with labs from https://www.openintro.org/stat/textbook.php. The textbook is used in accordance with the Creative Commons BY-SA 3.0 license.

Industry (NAICS)

Information Technology -- Computer Operating Systems (Programming), Digital Forensics, Cyber Security, Network Security, IT Security (51)

Occupation (SOC)

Computer and Mathematical Occupations (15-0000)

Instructional Program (CIP)

Mathematics and Statistics (27)

Credit Type

Credential Type

Associate Degree

Educational Level

1st Year Community College or equivalent
2nd Year Community College or equivalent

Skill Level

Quality Assurance Organization