
Data Processing and Analytics

Purpose: to help other instructors teaching the same course

Common Course ID:  CIS 3200
CSU Instructor Open Textbook Adoption Portrait
Abstract: This open textbook is used in a data processing & analytics course for undergraduate or graduate students by Jongwook Woo at California State University, Los Angeles. The open textbook provides easy-to-understand explanations of the foundational concepts and practical lab concepts of Elasticsearch. Elasticsearch is a powerful open-source search and analytics engine, widely used for handling large-scale data and powering search functionality in applications around the world. The main motivations for adopting an open textbook were that it is available free of cost and is well recognized in the data processing & analytics field. Most students access the open textbook online.

About the Course

Course Title and Number: Data Processing and Analytics - CIS 3200
Brief Description of course highlights: Intensive, hands-on instruction in using software applications, including spreadsheets, database management, application integration, data mining, data visualization, and e-collaboration using Elasticsearch. By the end of the course, students will have the foundational skills to build data analytics solutions to business problems.

Student population: Students of CIS 3200 are enrolled in the undergraduate Information Systems program. However, given the broad appeal of CIS 3200, it is not uncommon to see students from other Business Administration programs taking the course. Most students have a software development background, but some do not. The prerequisite is experience with at least one programming language, gained by taking CIS 2830.

Learning or student outcomes: Upon completion of the lecture section, students will be able to:
• Identify Data and Predictive Analysis using the Elasticsearch search engine
• Learn Data Analytics and Machine Learning
• Learn how to use cloud computing
• Learn the fundamental theories and algorithms used to process, store, analyze, and predict data using Machine Learning, Data Analytics, and Predictive Analytics
• See the use cases and examples of Machine Learning, Data Analytics, and Predictive Analytics in business

Key challenges faced and how resolved: Students will engage in practical, hands-on exercises to reinforce their understanding of the concepts, which will require access to servers and lab tutorials. The material in this open textbook is designed to complement and support these lab sessions, providing a comprehensive learning experience.

About the Resource/Textbook 

Textbook or OER/Low cost Title: Elasticsearch - The Definitive Guide: 2.x

Brief Description:  The world is overwhelmed with data, and while existing technologies focus on storage, the real challenge lies in making real-time, data-informed decisions. Elasticsearch, a distributed, scalable, real-time search and analytics engine, enables you to search, analyze, and explore your data in unexpected ways. This book introduces the core concepts needed to start working with Elasticsearch, covering both full-text search and real-time analytics. It also delves into advanced techniques, including structured search, language complexities, geolocation, and data modeling, all while explaining how to configure and monitor your Elasticsearch cluster for scalability and production use.
Please provide a link to the resource: https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html

Authors:  Clinton Gormley

Student access: The textbook is complemented by slides that the instructor developed from the explanations and examples in the textbook. The slides are shared with students via CalStateLA's learning management system, Canvas.

Provide the cost savings from that of a traditional textbook. The cost of a textbook on Amazon is around 35 USD.

License: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

OER/Low Cost Adoption

OER/Low Cost Adoption Process

Provide an explanation or what motivated you to use this textbook or OER/Low Cost option. Ease of access. Students can access the textbook from anywhere using their browser and focus only on the parts they are interested in.

How did you find and select the open textbook for this course? I have been collaborating with Elastic, the company behind Elasticsearch.

Sharing Best Practices: For faculty starting with OER or low-cost options, begin small by replacing a few resources and collaborate with experienced colleagues. Use the company's own platforms, as well as OER Commons or OpenStax, for easy-to-access materials, and take advantage of institutional support. Customize the materials to fit your course, and focus on long-term sustainability. Gather student feedback to refine resources.
What I wish I had known earlier: engaging with the OER community is invaluable, implementing OER takes time but pays off in the long run, and the flexibility to adapt resources offers more control over course content.

Describe any key challenges you experienced, how they were resolved, and lessons learned. A key challenge is designing course and lab content that caters to the diversity of technical backgrounds in the student cohort. Course and lab material should be easy for the students to follow and reproduce.

About the Instructor

Instructor Name - Jongwook Woo
I am an Information Systems professor at California State University, Los Angeles. I teach Big Data Analysis and Science.

Please provide a link to your university page.
https://www.calstatela.edu/faculty/jongwook-woo

Please describe the courses you teach: CIS 3200 DATA PROCESSING & ANALYTICS and CIS 4560 INTRODUCTION TO BIG DATA. The courses focus on both conceptual and practical knowledge of Big Data systems for storing and processing large-scale data. Students learn how to collect, engineer, analyze, and visualize data, while also gaining skills in predicting future trends using machine learning and AI within the business domain.

Describe your teaching philosophy and any research interests related to your discipline or teaching.  I encourage students to focus on understanding practical concepts. My research is centered on Data and Predictive Analysis using Big Data technologies, including Hadoop, Hive, Spark, Deep Learning, and IoT, exploring parallel and distributed computing within the business domain.