
Online programs
Advance your career from anywhere in the world. With flexible online courses taught by the same expert faculty who teach on campus, you’ll earn a world-renowned IU degree—on your own schedule.

Level up your technical expertise with the online Certificate in Artificial Intelligence and become the expert companies need.
Quickly and conveniently acquire new skills in topics such as data analysis, cloud computing, health and medicine, statistics, and data mining.
The online M.S. in Data Science from the Luddy School offers working professionals the flexibility to advance their careers while gaining specialized knowledge in data science.
The online courses offered for our Online Master’s and Online Certificate programs are listed below. Please check the official Schedule of Classes for section numbers and the instructor of record each Fall, Spring, and Summer term.
If you need additional information about a course, you are welcome to reach out to the instructor directly.
Note: Unless otherwise specified, all courses listed are worth three (3) credit hours.
Algorithms are central to computer-related tasks. This course introduces the meta-task of algorithm building as well as individual algorithms. It uses mathematical tools to design and analyze algorithms and includes hands-on coding experience. The course is intended especially for non-CS majors and for students who are more interested in applications than theory.
Upon completion of this course, students should be able to:
Prerequisite(s): Experience with programming, data structures, and algorithms is assumed. Assignments involve a substantial amount of programming in Python. The course also includes mathematics such as linear algebra, probability theory, and basic calculus.
This course covers the fundamentals of artificial intelligence and is intended for M.S. students, early Ph.D. students, advanced undergraduate students in Computer Science and Data Science, and students in related fields who have a strong computing background.
Topics include, tentatively:
Applications: computer vision, natural language processing, and robotics.
Prerequisite(s): Programming in imperative, functional, and object-oriented styles is assumed, along with knowledge of algebra, elements of discrete math, and data structures and algorithms.
This course introduces database concepts and systems. Topics include database models and systems, especially relational, object-oriented, semi-structured, and graph data models; query languages and database programming; database design and modeling; components of query processing; data structures and algorithms for efficient query processing; and an introduction to transaction management, including concurrency and recovery.
Prerequisite(s): CSCI B551 or equivalent is required. Proficiency in a general-purpose programming language such as Python or C/C++ is expected. Students should be able to implement basic matrix operations using basic data structures in the language. Exposure to linear algebra, basic calculus, machine learning, graph theory, probability theory, geometry, and statistics is extremely helpful.
This introductory course in computer vision provides a broad overview of the field, with some emphasis on topics that reflect current research trends such as object recognition and deep learning. The course emphasizes algorithms, mathematical models, and techniques that apply broadly across vision and other areas of AI and computer science.
Topics include, tentatively:
Prerequisite(s): Successful completion of the Entrance Exam with a score of 6/10 is required. After completing the exam, students must email their score to the Luddy Office of Online Education so that permission can be granted.
This course is designed for students who want to become machine learning practitioners, better problem solvers, or future machine learning researchers. It introduces theoretical concepts and algorithms in a step-by-step manner, supported by intuition, examples, and Python Jupyter notebooks. Students study core machine learning algorithms and work through numerous example applications of machine learning. The course emphasizes practical understanding, helping students learn how machine learning algorithms work, how to use them, and how to avoid common pitfalls.
For students with a stronger interest in machine learning theory and development, the course provides an optional track that explores theory more deeply and culminates in coding core machine learning algorithms from scratch and potentially extending them.
Prerequisite(s): Not specified.
Database systems are central to data science because they store and manage data. Relational databases have supported major industries for decades and remain widely used. In the era of big data, the database landscape continues to change, and non-relational databases have become an important part of enterprise data architecture. Relational databases were developed long before the Internet and the Web to address centralized data storage and management. NoSQL databases emerged alongside Internet and Web applications to connect companies with customers and to support agile development in rapidly changing environments. The need for agility and for handling data variability and data integration has driven enterprises toward NoSQL database technology. This course provides a basic overview of the current database landscape and its tools, beginning with relational databases and extending to NoSQL databases such as MongoDB, Neo4j, Cassandra, and Redis.
Prerequisite(s): R, Python, and Statistics.
This course is designed to develop practical skills needed for applied data science research. It is organized around the stages of the data science workflow: setting expectations, exploratory data analysis, modeling, interpreting results, and communicating results. The course covers algorithms, best practices, and evaluation criteria. Both good and bad application examples are discussed to help students build stronger intuition about algorithm and visualization choice, best practices, and methods for evaluating results. Lectures and readings provide the theoretical foundation, and assignments provide hands-on practice in developing practical skills.
This course provides a foundation in the use of modeling techniques in managerial decision-making. It covers three areas of modeling: forecasting, computer simulation, and optimization. Computer simulation is introduced and followed by more advanced topics. The course concentrates on input and output analysis for simulation models. In optimization, the course covers linear programming, integer programming, nonlinear programming, and genetic algorithms. Two weeks are devoted to forecasting, with a broad overview of key forecasting techniques.
Upon completion of this course, students should be able to:
Prerequisite(s): Machine Learning and Python
Natural language processing (NLP) is an essential skill in many data science tasks. Students encounter challenges in data wrangling, data collection, data analysis, and data understanding. This course introduces NLP basics and guides students through common NLP tasks for data analysis. In the first half of the course, students learn NLP processing skills. In the second half, they explore domain-specific NLP techniques for data analysis in healthcare, banking, marketing, customer service, and technology. This course prepares students for more advanced data science courses such as Machine Learning and Deep Learning, as well as linguistics-oriented courses such as Computational Linguistics.
This course provides a gentle yet intensive introduction to programming in Python for students with little or no prior programming experience. Python is an open-source language that supports rapid application development, is object-oriented by design, and provides an excellent platform for data science. The course emphasizes planning and organizing programs and developing high-quality software that solves real-world problems.
Students will:
This hands-on course provides a guided platform for learning and practicing time-series analysis. It covers time-series regression and exploratory data analysis, ARMA/ARIMA models, model identification and estimation, linear operators, Fourier analysis, spectral estimation, and state space models. Analyses are performed using the freely available R packages astsa, xts, and zoo. Lectures and readings are required, as are R (including RStudio and R Markdown), GitHub, and GitHub Desktop.
Prerequisite(s): Basic proficiency in Python and data wrangling is required. Familiarity with data analysis tools such as Pandas and NumPy, as well as introductory knowledge of statistical analysis and machine learning, is recommended.
This course is designed for data science students who want to apply analytical skills to biomedical research using the NIH AllofUs Research Program data. It emphasizes the use of large-scale, diverse datasets to explore real-world biomedical questions and focuses on hands-on group projects using the AllofUs (AoU) workbench. Students gain practical experience analyzing genomics, clinical, and socioeconomic data to address population health, disease risk, and personalized medicine applications.
Prerequisite(s): Basic proficiency in Python is required.
This course provides a practical introduction to fundamental and cutting-edge techniques in artificial intelligence. Students gain hands-on experience developing and applying AI models across various domains. The course is structured around seven modules, allowing students to tailor their learning experience by selecting at least four modules of interest. Each module introduces practical implementation of a specific AI methodology and prepares students to address real-world AI challenges. Students may enroll for 1–3 credit hours per academic term.
Upon completion of this course, students should be able to:
Prerequisite(s): Not specified.
This course presents data science as a means to an end: answering questions and solving problems for organizations such as companies, nonprofit organizations, investors, governments, regulators, journalists, employees, customers, and communities. It is designed to prepare students for careers that apply data science to business problems. This course emphasizes business applications of data science and aims to help students become better consumers of statistical information and better critics of how statistics are used in public and business contexts.
Upon completion of this course, students should be able to:
Prerequisite(s): To register, students must submit an offer letter from the hiring entity to the Office of Online Education, along with a Graduate Internship form. Students should contact the Luddy Office of Online Education for further instructions.
Graduate internship credit may be awarded to students undertaking a significant experiential learning opportunity through a company, organization, nonprofit, or similar setting. Students are responsible for securing their own internships, although they may contact Luddy Career Services for assistance and resources. The internship must last at least 6 weeks and include no fewer than 160 hours of supervised work. A student may earn no more than three credit hours in the course, and the experience must be integral to the curriculum.
Prerequisite(s): M.S. students must be in their final year of the program or have completed at least 18 credit hours in the program.
This course is designed to help students experience the complexities and nuances of applying data science in the real world. Students work in teams on ongoing and new projects defined by a project sponsor, who may be an academic or an industry practitioner. Students work with the sponsor and their team members to understand the problem domain, define a role, identify where data science skills can be applied, and develop a solution. Much of the course focuses on moving from ambiguity to an achievable outcome. Weekly reading assignments address aspects of data science consulting and project management. The course emphasizes the learning experience over technical project outcomes.
Upon completion of this course, students will have:
Prerequisite(s): Not specified.
This variable-credit, asynchronous course consists of several beginner and advanced mini-topics designed to build and enhance data science skills and technologies. Each topic spans 4–6 weeks and counts as one credit hour. Students may enroll for 1–3 credit hours per academic term, and topic selection is administered through the course Canvas site during the first week of each term. All topics include weekly discussion requirements and deadlines to support time management. Students enrolled for 3 credit hours should expect to spend 9–12 hours per week on three individual topics. Topics are designed to be completed sequentially or concurrently.
Note: No more than three credit hours of On-Ramp credit may be applied to the Data Science program requirements, effective Spring 2019.
Topics include:
Prerequisite(s): To register, students must submit a project proposal to the Luddy Office of Online Education, along with an Independent Study form. Students should contact the Luddy Office of Online Education for further instructions.
Independent study courses allow students to conduct individualized projects under the supervision of a faculty member. Up to three credit hours may be earned to conduct research or to explore specific areas of data science not covered well by a formal course. The course is managed by a supervising faculty member in conjunction with the student’s proposed learning goals. The student and faculty member discuss and propose goals, topics, and projects.
Prerequisite(s): STAT S519 or equivalent; CSCI P556 strongly suggested.
This course teaches advanced machine learning concepts while also covering many signal processing applications. Students are exposed to signal processing applications through lectures and homework, including speech denoising, music source separation, stereo image matching, temporally ordered Twitter streams, EEG recordings, image segmentation, and related topics. Lectures are structured in a problem-solving format, with machine learning models introduced to address specific motivating problems. The course begins with basic unsupervised and supervised machine learning models and extends to more advanced topics including kernel methods, probabilistic topic modeling, hashing, Kalman filtering, boosting, and more. Students are strongly encouraged to have backgrounds in probability theory, optimization, and linear algebra. The course is homework-heavy and programming-oriented.
Prerequisite(s): A high comfort level with systems programming and debugging is expected. Assignments include nontrivial programming in a language of the student’s choice.
This course covers basic concepts of programming models and tools for cloud computing in support of data-intensive science applications. Students become familiar with current research topics in cloud platforms, parallel algorithms, storage, and high-level language support within a complex ecosystem of tools spanning multiple disciplines.
Course objectives include:
Prerequisite(s): Intermediate C experience and familiarity with Linux/Unix command-line utilities are required.
This course provides an entry-level, hands-on learning experience in supercomputing. It introduces the essential concepts, knowledge, and skills needed for a career in supercomputing or for effective use of HPC in other disciplines. The course also serves students interested in HPC engineering and design, software development, or system administration. It aims to develop a new generation of computer and computational scientists with expertise in the development, operation, and application of high-performance computing systems. The course is interdisciplinary, combining hardware technology and architecture, system software and tools, programming models, and application algorithms, with a focus on performance management and measurement. Experimental exercises provide hands-on reinforcement.
Topics include:
Prerequisite(s): STAT S519 and CSCI P556 are required; ENGR E511 or DSCI D590 Introduction to NLP for Data Science is helpful but not required.
This course provides a comprehensive introduction to deep learning. It begins with the basics of neural networks, principles of deeper neural networks, and optimization techniques specific to deep learning. It then introduces core deep learning models widely used across application areas, such as convolutional neural networks, recurrent neural networks including LSTMs and GRUs, and embedding models for text/language modeling and signal processing. The course also covers generative models, including variational autoencoders, generative adversarial networks, and autoregressive models. In addition, it addresses engineering aspects of neural networks, especially compression algorithms used to reduce the cost of runtime inference in hardware deployment. The course includes programming-oriented homework and final projects.
Prerequisite(s): Knowledge of a programming language is required, along with the ability to learn additional languages as needed and a willingness to enhance knowledge through online resources and additional literature. Access to a modern computer capable of running virtual machines and/or containers is needed. Knowledge from ENGR E516 is desirable and will make project execution easier. ENGR E516 and this course may be taken in parallel.
This course investigates the use of cloud-based data analytics for processing Big Data and solving problems in Big Data Applications and Analytics. Case studies such as Netflix recommender systems, genomic data, sports, health, and others are discussed.
Course objectives include:
The visual representation of information requires a deep understanding of human perceptual and cognitive capabilities, data mining and visualization algorithms, interface and interaction design, and creativity. Data such as Twitter data, books, or social networks is typically non-spatial and must be mapped into a physical space that faithfully and efficiently represents relationships in the information. When done successfully, data visualizations combine human and machine intelligence to solve tasks that neither could accomplish alone.
This course provides an overview of state-of-the-art information visualization. It teaches the process of producing effective temporal, geospatial, topical, and network visualizations. Students use tools such as Tableau, D3.js, OpenRefine, Gephi, and Plot.ly. Students also have the opportunity to collaborate on real-world projects for a variety of clients.
Topics include:
Prerequisite(s): Some programming background is necessary. No specific language is required, but students are expected to learn new languages as needed. One lab is related to buffer overflows in C. The course also assumes familiarity with the Linux command line.
This course provides an extensive survey of network security. It covers threats to information confidentiality, integrity, and availability across different Internet layers, as well as defense mechanisms that address those threats. The course also provides a foundation in network security, including cryptographic primitives/protocols, authentication, authorization, access control technologies, and hands-on experience through programming assignments and course projects.
This course uses the tools of economics to better understand computer security. It is not a course in economics research and does not aim to develop new economic theory. The required economics background is modest, and a strong mathematical background without economics is sufficient. There is no textbook.
At its core, the course is designed to improve decision-making for organizations that depend on security professionals. In addition to the fundamental language of decision-making, the course identifies dimensions of organizational and economic behavior that affect the success of technical security choices.
Prerequisite(s): Python, R, and C or C++ are required.
Machine learning techniques have been successful in analyzing biological data because they handle randomness, uncertainty, and data noise well and generalize effectively. This course introduces classical machine learning techniques such as Naive Bayes, principal component analysis, clustering, and neural networks using biological problems. It also covers recent developments and applications of dimensionality reduction and deep networks, along with their successful use in biological problem solving. Finally, the course covers probabilistic models, including Markov models, hidden Markov models, and Bayesian networks, for biological sequence analysis and systems biology.
Assessments consist of take-home assignments and five programming exercises.
Upon completion of this course, students will:
Prerequisite(s): A reasonable programming background is necessary. Courses in operating systems, networking, and computer architecture are helpful but not required. No particular language is required, though students are expected to learn new languages if needed.
This graduate-level course covers the design and analysis of secure systems. Topics include identifying security goals and risks, threat modeling, defense, integrating different technologies to achieve security goals, developing security protocols and policies, and implementing security protocols and secure coding. The course also studies real-world scenarios with many security requirements.
Prerequisite(s): Undergraduate-level expertise in computational thinking is expected, but a strong programming background is not. Experience with the Linux file system and MySQL will be helpful.
Data is abundant, and that abundance offers opportunities for discovery as well as economic and social gain. However, data can be difficult to use, noisy, and inadequately contextualized. There can be a large gap between data and knowledge because of technological or policy limitations that prevent easy integration with other data. This course examines the principles and technologies needed to capture, clean, contextualize, store, access, and trust data for repurposed use. Students are introduced to the capabilities and benefits of big data, the key components of big data projects, and the major steps in data analysis and visualization.
Topics include:
Students are expected to spend 6–7 hours per week on readings, reflections, and instructional content.
Prerequisite(s): A good understanding and working knowledge of programming and open-source libraries is required, as the course uses the Python data and visualization stack extensively. A basic understanding of mathematics, statistics, and Web technologies, including HTML, CSS, JavaScript, and JSON, is recommended.
Data visualization is widely used, from TV news to scientific papers and from home offices to large companies, to reveal patterns in data and tell stories. As more data is collected, more decisions are made through data analysis, making data visualization an essential skill for knowledge workers. This course introduces basic statistical data analysis and visualization. It covers the fundamentals of data visualization in the context of perception, integrity, design, statistics, data types, and visualization techniques. Hands-on exercises using the Python stack are an integral part of the course.
By the end of the course, students are expected to be able to understand, explain, and manipulate basic types of data, analyze them using basic exploratory visualization techniques, and create explanatory visualizations. Students will also be able to evaluate the effectiveness of visualizations based on human perception, design, data types, and visualization techniques.
Prerequisite(s): A strong foundation in mathematics, statistics, and programming is required, although there is no formal prerequisite. Key topics include probability, statistics, linear algebra, data structures, and algorithms. Python is the main programming language, and proficiency in Python is very helpful.
Networks, or graphs, provide a unifying framework for studying complex systems such as living organisms, societies, and many techno-social systems. This graduate-level course focuses on the fundamental concepts and key applications of network science. It covers recent advances in statistical properties and models of real-world networks, network algorithms, and practical applications. Topics include how information and diseases spread in society, measures and algorithms for quantifying importance, link prediction, and community detection.
By the end of the course, students are expected to be able to identify, construct, and analyze networks by choosing and applying appropriate methods and algorithms. Students are also expected to be able to explain, both mathematically and conceptually, the key network concepts and statistical properties and their implications.
Prerequisite(s): Knowledge of linear algebra and basic statistics is helpful.
With the exponential growth of the Web over the past decades, information flooding has become a major challenge. The success of GYM (Google, Yahoo, and MSN) has shown that information retrieval is a key component in helping users access target information based on their needs. This course introduces information retrieval theories and concepts underlying search applications. It investigates techniques used in modern search engines and demonstrates their significance through experimentation.
Upon completion of this course, students should be able to:
Prerequisite(s): Adequate knowledge of Python is required to read, modify, and write code independently.
This course introduces the growing field of social media mining. It explores what is meant by social media and why it is valuable to mine it. After establishing basic definitions and motivations, the course focuses on techniques and methods used to extract meaningful signals from the growing flood of social media data. The course includes hands-on guided exercises using Python and academic papers in which authors present their methods, research questions, and insights about mining the social web.
Prerequisite(s): Intermediate algebra skills are required, including comfort with functions, logarithms, and college-level mathematical notation. To register, students must email the Statistics Department at statdept@iu.edu and include their 10-digit UID.
This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and statistical methods to practical situations or actual data sets.
Upon completion of this course, students should be able to:
Prerequisite(s): S519 is required for enrollment. Students should know how to calculate probabilities using software or otherwise for fundamental probability distributions such as the binomial and normal. Students should also know the forms and interpretations of t-tests, confidence intervals, and the simple linear regression line. Some experience with R is expected. To register, students must email the Statistics Department at statdept@iu.edu and include their 10-digit UID.
This course surveys statistical methods that do not rely on parametric assumptions. Knowledge of introductory statistics at the S320/S520 level is assumed, and the course serves in some ways as a sequel. It reviews parametric techniques learned in introductory courses and compares them with nonparametric alternatives to show when one technique outperforms another.
Topics include:
Prerequisite(s): To register, students must email the O’Neill Records Office at oneillrc@iu.edu and include their 10-digit UID.
This course applies statistical analysis to issues in public and environmental affairs and related fields. It covers descriptive statistics, statistical inference, the nature of random variables, sampling distributions, point and interval estimation of parameters such as the mean and standard deviation, hypothesis testing, analysis of variance, and bivariate and multivariate regression. The course emphasizes the practical application of these methods, the appropriate interpretation of results, and a meaningful understanding of how statistical analysis can be misused or executed incorrectly. The use of computer tools for statistical analysis, primarily SAS, is also a major emphasis.
Prerequisite(s): A graduate-level introductory statistics course covering the simple two-variable regression model and an introduction to multivariate regression is required. To register, students must email the O’Neill Records Office at oneillrc@iu.edu and include their 10-digit UID.
This course provides an intermediate-level perspective on statistical concepts and techniques for analyzing and modeling complex systems through regression analysis. It includes estimating model parameters from existing data, testing hypotheses about these systems, forecasting, correcting for violations of assumptions, and addressing common problems such as near multicollinearity. The course is primarily focused on single-equation regression models and their extension to a variety of situations, while also introducing simultaneous equation models. Applications of these techniques are drawn from public and environmental affairs as well as the broader social sciences.