Forgive me, for I am not an expert in multi-threading by any means and need some assistance. Some background before I get to my question:
Pre-Knowledge
- Developing C++ code on the Jetson TK1
- The Jetson has 4 CPU cores (quad-core ARMv7 CPU)
- From what I have researched, each core can run one thread at a time (4 cores -> 4 threads running in parallel)
- I am running a computer vision application which uses OpenCV
- Capturing frames from a webcam as well as grabbing frames from a video file
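The "4 cores -> 4 threads" figure above is only what I have read, so here is a minimal sketch of how it could be checked at runtime. `usable_threads` is a hypothetical helper name, and the fallback of 4 is my assumption for the TK1; `std::thread::hardware_concurrency()` is allowed to return 0 when the value cannot be determined.

```cpp
#include <thread>

// Returns the number of hardware threads reported by the standard library,
// or `fallback` when the library cannot determine it (it may return 0).
unsigned usable_threads(unsigned fallback = 4) {
    unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

Calling `usable_threads()` once at startup would at least confirm how many threads can actually run at the same time on the board.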
Pseudo-Code
I am trying to optimize my multi-threaded code so that I get the maximum performance out of my application. Currently this is the basic layout of my code:
int HALT=0;

//Both func1 and func2 can run in parallel for a short period of time
//but both must finish before moving to the next captured webcam frame
void func1(*STUFF){
    //Processes some stuff
}

void func2(*STUFF){
    //Processes similar stuff
}

void displayVideo(*STUFF){
    while(PLAYBACK!=DONE){
        *reads video from file and uses imshow to display the video*
        *delay to match framerate*
    }
    HALT=1;
}

main{
    //To open these I am using OpenCV's VideoCapture class
    *OPEN VIDEO FILE*
    *OPEN WEBCAM STREAM*
    thread play(displayVideo, &STUFF);
    play.detach();
    while(HALT!=1){
        *Grab frame from webcam*
        //Process frame
        thread A(func1,&STUFF);
        thread B(func2,&STUFF);
        A.join();
        *Initialize some variables and do some other stuff*
        B.join();
        *Do some processing... more than what is between A.join and B.join*
        *Possibly display webcam frame using imshow*
        *Wait for user input to watch for terminating character*
    }
    //This while loop runs for about a minute or two so thread A and thread
    //B are being constructed many times.
}
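To make the per-frame pattern above concrete, here is a compilable sketch of just the spawn/join part. It is not my real code: the bodies of `func1` and `func2` are placeholders, `run_frames` and `g_halt` are hypothetical names, and I have assumed the shared flag should be a `std::atomic<bool>` rather than a plain `int`, since it is written by one thread and read by another.

```cpp
#include <atomic>
#include <functional>
#include <thread>

std::atomic<bool> g_halt{false};  // stands in for the plain int HALT

// Placeholder work: each function fills in one result for the frame.
void func1(int frame, int& out) { out = frame * 2; }
void func2(int frame, int& out) { out = frame + 1; }

// Runs the per-frame pattern for at most max_frames iterations:
// construct A and B, join both, then use their results.
// Returns the number of frames whose results came back as expected.
int run_frames(int max_frames) {
    int processed = 0;
    for (int frame = 0; frame < max_frames && !g_halt.load(); ++frame) {
        int a = 0, b = 0;
        std::thread A(func1, frame, std::ref(a));
        std::thread B(func2, frame, std::ref(b));
        A.join();
        // ... work that only needs func1's result would go here ...
        B.join();
        // ... work that needs both results would go here ...
        if (a == frame * 2 && b == frame + 1) ++processed;
    }
    return processed;
}
```

As in the layout above, two new threads are constructed and destroyed on every iteration, which is the part I suspect matters for the behavior described in my question.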
Question(s)
What I would like to know is whether there is a way to specify which core a thread runs on when I construct it. I fear that because I create threads A and B over and over again, they are jumping around between cores and hampering the speed of my system and/or the reading of the video. Although this fear is not well justified, I see very bizarre behavior on the four cores when running the code.
Typically I see one core running at around 40-60%, which I assume is either the main thread or the play thread. On the other cores, the computational load is very jumpy. Throughout the run I also see two cores go from around 60% all the way to 100%, but it is not always the same two cores: it could be the first, second, third, or even fourth core. They then drop sharply, usually to about 20-40%. Occasionally I see a single core drop to 0% and stay there for what appears to be one more cycle through the while loop (i.e. grab frame, process, thread A, thread B, repeat). Then all four are active again, which is the behavior I would expect.
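To show what I mean by "specifying the core", here is a minimal sketch of the kind of call I have in mind. It assumes Linux with glibc (which the Jetson runs) and uses `pthread_setaffinity_np`, a non-portable GNU extension. `pin_to_core` is a hypothetical helper name, and I do not know whether pinning would actually help in my case.

```cpp
#include <pthread.h>
#include <sched.h>

// Restricts the thread identified by `handle` to a single CPU core.
// Returns true on success, false if the core index is out of range
// or the kernel rejects the request.
bool pin_to_core(pthread_t handle, int core) {
    if (core < 0 || core >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);        // start with an empty CPU mask
    CPU_SET(core, &set);   // allow exactly one core
    return pthread_setaffinity_np(handle, sizeof(cpu_set_t), &set) == 0;
}
```

With this, I imagine calling something like `pin_to_core(A.native_handle(), 2)` right after constructing A (assuming `native_handle()` returns a `pthread_t`, as it does with libstdc++ on Linux), or `pin_to_core(pthread_self(), 2)` as the first line of func1.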
I hope I have not been too vague in this post. I am seeing slightly unexpected behavior and would like to understand what I might be doing incorrectly or not accounting for. Thank you to whoever can help or point me in the right direction.