CC3047_T T
 
Class no. 1, 14-02-2024
Introduction to large-scale data and Python libraries for handling large-scale data.
Class no. 2, 21-02-2024
Introduction to parallelism. Shared memory programming model and architecture.
Class no. 3, 28-02-2024
Distributed programming model and architecture. Loop restructuring and Python multiprocessing pool.
Class no. 4, 06-03-2024
Parallel and distributed platforms. High Performance Computing, High Throughput Computing, Cloud Computing.
Class no. 5, 13-03-2024
Introduction to virtual machines and virtualization. The MapReduce model.
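As a rough illustration of the MapReduce model named above, the following sketch counts words in plain Python. It is a toy under stated assumptions, not course material: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical, everything runs in a single process, and no framework such as Hadoop or Spark is involved.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # Each document could be mapped on a different worker; here the
    # calls run sequentially to keep the sketch self-contained.
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))
```

In a real deployment the map and reduce calls would be distributed across machines and the shuffle would move data over the network; this sketch only shows how the three phases fit together.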
Class no. 6, 20-03-2024
Introduction to Apache Spark.
Class no. 7, 10-04-2024
First test. Duration: 1h 50 min.
Topics:

Concept of cloud and types of clouds
Concept of virtualization
Types of computer architectures
Programming models, data distribution
Advantages and disadvantages
Characteristics
Exercises given in practical classes

Suggested book chapters and sections (Cloud Computing: Theory and Practice, by Dan Marinescu, 1st edition; chapter and section numbers may differ in the second edition)
- C1: intro, s1.3, s1.4, s1.5, s1.6, s1.7
- C2: intro, s2.1, s2.2, s2.9, s2.10
- C3: intro, s3.2, s3.7, s3.8, s3.9, s3.10
- C4: intro, s4.1, s4.2, s4.6, s4.7, s4.8, s4.9, s4.10
- C5: intro, s5.1, s5.2, s5.3, s5.4
- C6: intro
Class no. 8, 17-04-2024
Apache Beam, Dask and other Python libraries.
Class no. 9, 24-04-2024
Modin and joblib Python libraries.
Graph Neural Networks.
Class no. 10, 15-05-2024
Using logic for knowledge representation and as an alternative to graph networks.
Introduction to GPU programming.
Class no. 11, 22-05-2024
Opportunities for parallelization: programming and data.
Class no. 12, 29-05-2024
Second test. Duration: 1h 50 min.
Contents:

Data distribution and schedulers
Apache Beam, Dask, Modin, joblib
GNNs and PyTorch Geometric
CuPy, Numba, cuDNN, RAPIDS

Review the links suggested in theoretical and practical classes
Review practical classes