Multithreading in C [Question]
#1
I'm a total noob in this area of programming. I've searched on StackOverflow, but the answers weren't satisfying.

So if I had a program that, for example, had to delete files, how would I go about doing that with multithreading?

A single-threaded program deletes files one by one. I would need mine to distribute files to, let's say, 4 threads, and when one thread is finished, give it a new file to work on.

How do I go about doing this?

I've looked into the concept of a thread pool, but I don't have a clue how to implement it.

Any information is welcome.
#2
Linux or Windows?
The APIs/libraries/headers you use are different for each of them, unless you're able to use C++ and just use Boost or something.
#3
(11-25-2020, 11:53 PM)poppopret Wrote: Linux or Windows?
The APIs/libraries/headers you use are different for each of them, unless you're able to use C++ and just use Boost or something.

Windows, but I'd prefer the POSIX functions if it can be done that way.
#4
Unfortunately, pthreads isn't natively available on Windows, and you also won't have access to other POSIX calls like fork() (which spawns a new process rather than a thread anyway). I know MinGW has a pthreads-win32 port, but I don't know how well it's supported or how stable it is; I'd suggest just sticking to Visual Studio and MSVC.

Windows multithreading is a little more difficult just because of all its proprietary structures and whatnot, but the logic behind multithreading is all the same.

Essentially, a thread is just a function call that runs concurrently inside the same process as the main application and shares its memory. In your case, to delete files, each thread would just call your file deletion method of choice, then exit.

Some pseudocode to illustrate:
Code:
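for (int i = 0; i < 4; ++i) {
    /* delete_file is a placeholder for a void delete_file(void *path) function that calls remove() */
    _beginthread(delete_file, 0, (void *)files[i]);
}

Extremely simplified: delete_file() and files are placeholders, and there is no delete() function in C (the standard one is remove() from <stdio.h>; WinAPI has DeleteFile()). It's just used to illustrate how you can multithread deleting files.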

But this only deletes four files, not all the files in a list. In that case, you'd first need to split up the list of files (I'm assuming you have a char** or something that has a list of files you want to delete) into four groups (or as many groups as you have threads), then write a function that loops through the files in one group and deletes them, and pass that function to each thread along with its group.
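To make the splitting step concrete, here's a rough sketch in plain C. The names chunk, chunk_range() and delete_chunk() are made up for illustration, and it assumes the list is a const char ** with a known count. It only shows the bookkeeping and the per-thread loop, not the thread creation itself.

```c
#include <stdio.h>
#include <stddef.h>

/* One worker's slice of the shared file list: indexes [start, end). */
struct chunk {
    const char **files;
    size_t start;
    size_t end;
    size_t deleted;   /* filled in by delete_chunk() */
};

/* Gives worker `idx` (0-based) its slice when `count` files are divided
   among `nworkers` workers. The first (count % nworkers) workers take
   one extra file, so no file is skipped or handed out twice. */
void chunk_range(size_t count, size_t nworkers, size_t idx,
                 size_t *start, size_t *end)
{
    size_t base = count / nworkers;
    size_t extra = count % nworkers;
    *start = idx * base + (idx < extra ? idx : extra);
    *end = *start + base + (idx < extra ? 1 : 0);
}

/* The loop each thread would run over its own slice. */
void delete_chunk(struct chunk *c)
{
    c->deleted = 0;
    for (size_t i = c->start; i < c->end; ++i) {
        if (remove(c->files[i]) == 0)
            c->deleted++;
    }
}
```

With 10 files and 4 workers this hands out slices of 3, 3, 2 and 2 files. Since no two slices overlap, the threads never touch the same file and no locking is needed.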

Now, this is another simple way to do things, and it avoids race conditions: weird/undefined interactions when two threads simultaneously try to access/change something. If you just passed the whole list to each thread, you'd encounter all sorts of bad behavior: every thread would try to delete every file, so you'd get duplicated work and failed deletions just because the threads can't find files that other threads have already deleted.

That's solved with the concept of a mutex, which 'locks' certain data until the thread is done operating on it. That way, you can hand the same list to every thread and have each one lock the mutex, claim the next file nobody has taken yet, unlock, and then delete it. When a thread finishes a file, it just goes back for the next one.
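Here's a minimal sketch of that idea, assuming a shared list plus a "next file" index guarded by one lock. All the names (work_queue, drain(), delete_all(), MAX_WORKERS) are made up for illustration. On Windows it uses a CRITICAL_SECTION and _beginthreadex(); the #else branch uses pthreads, since you mentioned preferring POSIX, so the same logic builds on either side.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#include <process.h>
typedef CRITICAL_SECTION lock_t;
#define LOCK(l)   EnterCriticalSection(l)
#define UNLOCK(l) LeaveCriticalSection(l)
#else
#include <pthread.h>
typedef pthread_mutex_t lock_t;
#define LOCK(l)   pthread_mutex_lock(l)
#define UNLOCK(l) pthread_mutex_unlock(l)
#endif

#define MAX_WORKERS 16   /* arbitrary cap for this sketch */

struct work_queue {
    const char **files;
    size_t count;
    size_t next;      /* index of the next unclaimed file */
    size_t deleted;   /* number of successful remove() calls */
    lock_t lock;      /* protects next and deleted */
};

/* Claim one file under the lock, delete it outside the lock, repeat. */
static void drain(struct work_queue *q)
{
    for (;;) {
        LOCK(&q->lock);
        if (q->next >= q->count) {
            UNLOCK(&q->lock);
            return;
        }
        const char *path = q->files[q->next++];
        UNLOCK(&q->lock);

        int ok = (remove(path) == 0);   /* slow part runs unlocked */

        LOCK(&q->lock);
        q->deleted += (size_t)ok;
        UNLOCK(&q->lock);
    }
}

#ifdef _WIN32
static unsigned __stdcall worker(void *p) { drain(p); return 0; }
#else
static void *worker(void *p) { drain(p); return NULL; }
#endif

/* Deletes every file with up to nthreads workers; returns how many were removed. */
size_t delete_all(const char **files, size_t count, int nthreads)
{
    struct work_queue q;
    int started = 0;
    q.files = files; q.count = count; q.next = 0; q.deleted = 0;
    if (nthreads > MAX_WORKERS) nthreads = MAX_WORKERS;
#ifdef _WIN32
    HANDLE h[MAX_WORKERS];
    InitializeCriticalSection(&q.lock);
    for (int i = 0; i < nthreads; ++i) {
        uintptr_t t = _beginthreadex(NULL, 0, worker, &q, 0, NULL);
        if (t != 0) h[started++] = (HANDLE)t;
    }
    if (started == 0) drain(&q);   /* no thread started: do the work inline */
    for (int i = 0; i < started; ++i) {
        WaitForSingleObject(h[i], INFINITE);
        CloseHandle(h[i]);
    }
    DeleteCriticalSection(&q.lock);
#else
    pthread_t t[MAX_WORKERS];
    pthread_mutex_init(&q.lock, NULL);
    for (int i = 0; i < nthreads; ++i) {
        if (pthread_create(&t[started], NULL, worker, &q) == 0) ++started;
    }
    if (started == 0) drain(&q);   /* no thread started: do the work inline */
    for (int i = 0; i < started; ++i) pthread_join(t[i], NULL);
    pthread_mutex_destroy(&q.lock);
#endif
    return q.deleted;
}
```

Something like delete_all(files, count, 4) would then give you the "4 threads, each grabs a new file when it's done" behavior from the first post. The lock is only held while picking the next index, so the actual deletions still run in parallel.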

For actual functions, there are two C runtime functions you can use to create threads besides CreateThread(): _beginthread() and _beginthreadex().

Every Windows thread has a handle, as with most objects in WinAPI. If you've played around with OpenProcess or something of that sort, then you already know what I'm talking about. You can use that handle to control a thread, but in Windows, which thread-starting function you use will affect how simple this is.

_beginthread() on Windows just starts a regular thread from a function pointer you pass to it. The thread ends when the function returns. It returns the thread handle on success and -1 on failure. Since the handle is closed automatically when the thread ends, it might already be invalid by the time you use it. Or, it might point to another thread.

But _beginthreadex() is a little more powerful. The thread still ends when its function returns (or calls _endthreadex()), but the handle stays open until you close it yourself with CloseHandle(). It also takes extra parameters, such as security attributes and an initial state, so you can create the thread suspended. It returns 0 on failure. Since the handle isn't closed automatically, a successful call always gives you a valid handle back.
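A small sketch of the _beginthreadex() pattern: start a thread, wait on its handle, then close the handle. run_and_wait(), task_fn, start_info and mark_done() are made-up names for illustration. The #else branch is only a pthreads equivalent so the sketch also builds outside Windows.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#include <process.h>
#else
#include <pthread.h>
#endif

typedef void (*task_fn)(void *);

/* Bundles the function and its argument so both fit in one void *. */
struct start_info {
    task_fn fn;
    void *arg;
};

#ifdef _WIN32
/* _beginthreadex() expects: unsigned __stdcall f(void *) */
static unsigned __stdcall trampoline(void *p)
{
    struct start_info *s = p;
    s->fn(s->arg);
    return 0;   /* returning from here ends the thread */
}
#else
static void *trampoline(void *p)
{
    struct start_info *s = p;
    s->fn(s->arg);
    return NULL;
}
#endif

/* Runs fn(arg) on a new thread and blocks until it finishes.
   Returns 0 on success, -1 if the thread could not be created. */
int run_and_wait(task_fn fn, void *arg)
{
    struct start_info info;
    info.fn = fn;
    info.arg = arg;
#ifdef _WIN32
    uintptr_t h = _beginthreadex(NULL, 0, trampoline, &info, 0, NULL);
    if (h == 0)
        return -1;                        /* 0 means failure */
    WaitForSingleObject((HANDLE)h, INFINITE);
    CloseHandle((HANDLE)h);               /* the handle is ours to close */
    return 0;
#else
    pthread_t t;
    if (pthread_create(&t, NULL, trampoline, &info) != 0)
        return -1;
    pthread_join(t, NULL);
    return 0;
#endif
}

/* Example task: sets the int it is given to 1. */
void mark_done(void *arg)
{
    *(int *)arg = 1;
}
```

For example, with int done = 0; a call to run_and_wait(mark_done, &done) should return 0 and leave done set to 1. Passing &info from the stack is only safe here because run_and_wait() waits for the thread before returning.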


Now, onto the last option, or what you wanted to try building: thread pools.
Can't say I've done much with this, but MSDN has a decent bit of documentation and a pretty simple code sample for creating a thread pool: https://docs.microsoft.com/en-us/windows...-functions
It doesn't do much, but it uses pretty much all the functions needed for getting/setting the data in the thread pool data structure, using it, and cleaning it up once done. It might be easier to do this in C++ where you can take advantage of OOP, but it's all C-compatible in any case.

Thread Pools are pretty advanced, so if you're still struggling to figure it all out, I'd suggest trying out the above possibilities first just to get a feel for multithreading and other possible solutions to your problem.
#5
(11-26-2020, 06:16 PM)poppopret Wrote: [...]

Wow, I did not expect such a detailed reply. Thank you, this is extremely helpful.