AP®︎/College Computer Science Principles, Unit 4, Lesson 4: Parallel and distributed computing

Parallel computing
One way to speed up some types of algorithms is to use parallel computing to spread the algorithm across multiple processors running simultaneously.
Sequential computing
The standard programming model is sequential computing: the computer executes each operation of the program in order, one at a time.
Consider this program that processes images and reports how many contain cats:
images ← ["pet1.jpg", "pet2.jpg", "pet3.jpg", "pet4.jpg"]
numCats ← 0
FOR EACH image IN images
{
  foundCat ← detectCat(image)
  IF foundCat
  {
    numCats ← numCats + 1
  }
}
The program begins with a list of filenames and a variable to store the number of images of cats, initialized to 0. A loop iterates through the list, repeatedly calling a function that detects whether the image has a cat and updating the variable.
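The same sequential logic can be sketched in Python. The detect_cat function here is a hypothetical stand-in; a real program would call an image recognition library instead.

```python
def detect_cat(image):
    # Hypothetical stand-in for a real cat detector:
    # pretend only pet2.jpg contains a cat.
    return image == "pet2.jpg"

images = ["pet1.jpg", "pet2.jpg", "pet3.jpg", "pet4.jpg"]
num_cats = 0
for image in images:
    found_cat = detect_cat(image)
    if found_cat:
        num_cats += 1

print(num_cats)  # 1
```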
Here's a visualization of the program executing on 4 images:
Now let's analyze the program further, by annotating each operation with how many seconds it takes and how many times it's called:
Time | Calls | Operation |
---|---|---|
3 | 1 | images ← [ "pet1.jpg", "pet2.jpg", "pet3.jpg", "pet4.jpg"] |
1 | 1 | numCats ← 0 |
1 | 4 | FOR EACH image IN images |
10 | 4 | foundCat ← detectCat(image) |
1 | 4 | IF foundCat |
2 | 4 | numCats ← numCats + 1 |
Note: the values in the time column are made up; the actual time would vary based on the implementation of this pseudocode and the hardware running it.
We can calculate the total time taken by multiplying the number of seconds for each operation by the number of calls and summing the results. For this program, that'd be:

(3 × 1) + (1 × 1) + (1 × 4) + (10 × 4) + (1 × 4) + (2 × 4) = 3 + 1 + 4 + 40 + 4 + 8 = 60 seconds
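That sum can be checked with a few lines of Python, where each pair comes straight from the table above:

```python
# (seconds per call, number of calls) for each operation in the table
operations = [
    (3, 1),   # images ← [...]
    (1, 1),   # numCats ← 0
    (1, 4),   # FOR EACH image IN images
    (10, 4),  # foundCat ← detectCat(image)
    (1, 4),   # IF foundCat
    (2, 4),   # numCats ← numCats + 1
]
total_seconds = sum(time * calls for time, calls in operations)
print(total_seconds)  # 60
```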
This timeline visualizes the time taken by the computer:
Parallel computing
The sequential model assumes that only one operation can be executed at a time, and that is true of a single computer with a single processor. However, most modern computers have multi-core processors, where each core can independently execute an operation.
Multi-core processors can take advantage of parallel computing, a computational model that breaks programs into smaller sequential operations and performs those smaller operations in parallel.
Can we modify the cat detection program so that some of its operations can be executed in parallel? Only if there are operations that are not dependent on each other.
🤔 Take a look at the program again and see if you can identify operations that don't need to happen sequentially.
For this program, we could parallelize the operations inside the loop, since the detectCat function does not depend on the results from other calls to detectCat. That means telling the computer that these lines of code can be executed in parallel:

foundCat ← detectCat(image)
IF foundCat
{
  numCats ← numCats + 1
}
The exact mechanisms for parallelizing a program depend on the particular programming language and computing environment, which we won't dive into here.
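As one concrete illustration (not the only mechanism), Python's standard concurrent.futures module can spread the detectCat calls across a pool of workers. The detect_cat function is again a hypothetical stand-in for a real detector.

```python
from concurrent.futures import ThreadPoolExecutor

def detect_cat(image):
    # Hypothetical stand-in for a real cat detector.
    return image == "pet2.jpg"

images = ["pet1.jpg", "pet2.jpg", "pet3.jpg", "pet4.jpg"]

# map() runs detect_cat on the images in parallel, up to 4 at a time,
# and returns the results in the original list order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(detect_cat, images))

num_cats = sum(results)
print(num_cats)  # 1
```

A thread pool keeps the sketch short; for genuinely CPU-bound work like image analysis, a Python program would typically use a ProcessPoolExecutor instead, so each worker runs on its own core.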
Here's a visualization of the program executing those operations in parallel:
Now how long will the program take to execute? Remember the analysis from before:
Time | Calls | Operation |
---|---|---|
3 | 1 | images ← [ "pet1.jpg", "pet2.jpg", "pet3.jpg", "pet4.jpg"] |
1 | 1 | numCats ← 0 |
1 | 4 | FOR EACH image IN images |
10 | 4 | foundCat ← detectCat(image) |
1 | 4 | IF foundCat |
2 | 4 | numCats ← numCats + 1 |
The non-parallelized operations take the same amount of time as before: 3 + 1 = 4 seconds.
The time taken by the parallelized operations depends on the number of parallel processors executing operations at once. If I run this program on a computer with 4 cores, then each core can process one of the 4 images, so the parallel portion will only take as long as a single image takes: 1 + 10 + 1 + 2 = 14 seconds.
This timeline visualizes the time taken by a computer that splits the program across 4 processors:
Comparing sequential vs. parallel execution
One way to evaluate the benefits of parallelizing a program is to measure the speedup: the ratio of the time taken to run the program sequentially to the time taken to run the parallelized program.
Our example cat detection program takes 60 seconds to run sequentially. When run in parallel on four processors, with each image requiring 14 seconds, the program takes 18 seconds to run. We calculate the speedup by dividing 60 by 18: 60 / 18 ≈ 3.3.
We do not achieve a speedup of exactly 4, despite using that many processors. The initial sequential portion still takes the same amount of time even if we have infinitely many processors. So, this initial sequential portion limits the maximum speedup of parallel computing.
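That limit can be sketched numerically. The model below uses the timings from this article: a fixed 4-second sequential portion and 14 seconds per image, with each processor handling a batch of images. Notice that the speedup plateaus once there are more processors than images.

```python
import math

sequential_seconds = 4    # list and variable initialization (3 + 1)
seconds_per_image = 14    # loop check + detectCat + IF + increment
num_images = 4

for processors in [1, 2, 4, 8]:
    # Each processor handles a batch of images;
    # the slowest (largest) batch sets the pace.
    batches = math.ceil(num_images / processors)
    total = sequential_seconds + batches * seconds_per_image
    speedup = 60 / total
    print(processors, total, round(speedup, 2))
```

With 1 processor the total is 60 seconds (speedup 1.0); with 4 or even 8 processors it never drops below 18 seconds, because the 4-second sequential portion and the 14-second single image remain.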
Parallelized programs typically run on input sizes in the thousands or millions. If we ran this program on thousands of images, the initial sequential part would take a tiny fraction of time compared to the parallelized part.
Variable execution times
The operations inside the parallelized program segments may not always take the same amount of time, especially if they process program input.
For example, the detectCat function in our example may be using an algorithm whose time varies based on the image size or complexity. The operation may only take 10 seconds on one image but 22 seconds on another. This timeline visualizes that situation:
The total time taken has now increased to 30 seconds, as the parallel portion takes as long as the longest of the parallelized operations.
To compensate for the fact that there's often variation in execution times, the program can be clever about how it sends work to processors. Instead of pre-allocating tasks to processors, it can send the next task to whichever processor is ready and waiting.
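That scheduling strategy can be simulated in a few lines: hand each task to whichever processor frees up first, tracked with a heap of processor finish times. The durations below are made up for illustration.

```python
import heapq

def makespan(durations, num_processors):
    """Total time when each task goes to the first free processor."""
    # finish_times[i] is the moment processor i becomes free.
    finish_times = [0] * num_processors
    heapq.heapify(finish_times)
    for duration in durations:
        # Give the next task to the processor that frees up first.
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)

# Seven images on 4 processors; the third image takes unusually long.
durations = [10, 10, 22, 10, 10, 10, 10]
print(makespan(durations, 4))  # 22
```

With this greedy assignment the slow third image doesn't hold anything up: the total is 22 seconds, whereas pre-allocating images round-robin would put images 3 and 7 on the same processor for a total of 32 seconds.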
For example, imagine we ran the cat detection program on seven images and the detectCat() operation took a long time on the third image. The program doesn't need to wait for that call to complete before it analyzes the seventh image; it can send that image to a free processor instead.
Want to join the conversation?
- How is a program parallelized?(3 votes)
- From the author:Great question, it depends on the programming language and environment. What programming languages are you familiar with?(7 votes)
- How do you make the cat detection program?(4 votes)
- Likely with AI image recognition.
I'd recommend this series if you want to learn more :Dhttps://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
(don't click the link, copy the URL into the address bar)(1 vote)
- Which programing language is best for parallel computing?(2 votes)
- It's hard to point to a single, best language. It really depends on what you want to do. For low-level concurrent programming, you could use C/C++ or Rust. For high-level concurrent programming, you can try Python or Go.(3 votes)
- How come in the "Sequential Computing" section, it says the "numCats ← numCats + 1" operation is called 4 times even though there was only 1 cat in the example? Doesn't the code automatically stop/exit at "IF foundCat" line for images where a cat isn't detected?(1 vote)
- The code will not automatically stop/exit; rather, the code will evaluate the conditional expression as false and will then continue to the next iteration of the loop without executing "numCats ← numCats + 1".
Also, it seems that the two examples are disparate. If the timetable given related to the example above it, then you are right in your assumption that "numCats ← numCats + 1" would only execute once, but in this case the example and table seem to be separate entities.(3 votes)
- How does a parallel work?(1 vote)
- On the "check your understanding," how is the time taken to run the program sequentially 60 seconds? My understanding was that the initial sequential part which is unchanged in both programs is 18 seconds (32 total sec parallel - 14 sec on the images parallel), and since there are two images to be processed sequentially, each taking 14 seconds, the total time taken to run the program sequentially would be 18 + 2(14) = 46 seconds. Please correct me if I'm wrong.(1 vote)
- The check your understanding is based on the cat detection program written above it which was shown to take 60 seconds to run sequentially. When parallelized on two processors, the time is equal to 4 (for the array initialization and variable assignment at the beginning) + 2 (for 4 images split between 2 processors) * 14 (the amount of time to process one image) = 32 seconds.(1 vote)
- are dogs the best pets?(1 vote)