Help understanding semaphores and producer/consumer problem
October 16, 2010 3:45 PM   Subscribe

Help understanding semaphores and producer/consumer problem (more inside..)

I'm having a really hard time trying to understand how to write a program to address the following problem:

Program is intended to address the producer/consumer problem by producing sequential integers and then consuming them by printing to the screen. Should run as an infinite loop and never deadlock.

Ultimately I have to create my own data type for a semaphore and syscalls for up() and down() but I thought the best way to start would be to first use existing syscalls. However, I can't find existing syscalls for up() and down() that will allow me to test this. Any suggestions? Also, do I need to fork of the calls to the producer and consumer functions so that they are both running at once? I'm having a hard time wrapping my head around this.

My very incomplete code
posted by Raichle to Computers & Internet (29 answers total) 2 users marked this as a favorite
Okay, so, you'll have two threads of execution. We'll call them A and B.

A's job is to create sequential integers. That's totally straightforward. It'll create one integer, put it in the queue, and flip the semaphore.

B's job is to sit and wait on the semaphore. When he sees it flip, he grabs the latest number in the queue, and prints it.

Both A and B run in tight loops, creating and adding; taking, and printing.

Now, you understand the concept of blocking, yes? That is, a process will continue along until it hits a system call (or other operation) that it cannot immediately complete. It will then (automatically) yield execution to the OS for scheduling other processes. The OS then reawakens it when the conditions exist to continue. Sometimes this is not what you want (realtime network IO), but in your case, this is exactly what you want.

So, thread A will try to acquire the semaphore. If the semaphore is available, then it immediately locks it, and does its job before unlocking the semaphore on the way out. If the semaphore isn't available, thread A blocks.

Thread B will try to acquire the semaphore. If it's available, it will grab an integer from the queue before unlocking the semaphore. Likewise, if the semaphore isn't available, B will block.

Notice that both A and B unlock the seamphore when they're done with their operations. This is because unlocking a semaphore will awaken other threads blocking on that semaphore.

How you ultimately implement this is dependent on what threading model, if any, is specified. You can complete your assignment using traditional UNIX designs, with shared memory (man shmget) and UNIX semaphores (man semget). Or, you can use POSIX threads, and pthread_mutex.

In both cases, you'll have a main thread (or process) and a child thread (or process). The most straightforward way of organizing these threads is that the main thread creates the child thread, and then goes on to execute the code for A. This is much easier than having the main thread create A and B and then wait for them to exit--if the main thread exits, they children will be terminated.

All in all, using posix threads is much easier for student code like this. This is because threads automatically share a single memory space. So, you can have a simple global reference to a SynchronizedQueue object (or struct), and both the A and B threads can simply reference it without having to go through the annoyance of shared memory and memory mapping.

For the implementation... You can do this flawlessly with very little work.

Create a queue object (or struct and helper functions). Create enqueue() and a dequeue() methods. Then, your code for synchronization is like:
void SyncQueue::enqueue(int item){

int SyncQueue::dequeue(){
    int retval = this->backing_store.pop_back();
    return retval;
A and B's functions will essentially come down to this:
static int counter = 0;
static SyncQueue q;
void* A(void *not_used){
    for (;;)

void* B(void *not_used){
    for (;;)
        printf("%d", q.dequeue());
Eventually you can replace the pthread_mutex_lock() calls with the x86 instruction inline assembler necessary for implementing a sempahore.
posted by Netzapper at 4:12 PM on October 16, 2010 [2 favorites]

Oh, oops. It dawns on me that you'll also need the threads to wait (voluntarily block on something) if the queue is either totally full or totally empty.

You need a mutex and also a condition variable.

Here, really, you need a condition variable and/or monitor.
posted by Netzapper at 4:19 PM on October 16, 2010 [1 favorite]

[ First of all just a bit of terminology -- you seem to be using the word syscall to refer to any function, but it has a specific meaning as a routine that calls into the operating system kernel. They are usually numbered because there is usually only a single entry point to kernel mode from usermode, so the desired syscall is passed as a number. Examples: linux, windows. ]

You don't have to fork, but you do have to have the producer and consumer running at the same time. You can do that with fork(), or with pthreads, or with whatever kind of threads your platform supplies. The thing you are conspicuously missing in your code are the locking semaphores. The whole point of this exercise is to understand that the reader and writer could both access your 'bufPtr' variable at the same time, which would be disastrous. You have to use locking primitives to ensure that only one or the other can read or write it at any given time. Also, you are lacking any kind of bounds checking, so you will overflow your buffer eventually which is not a good thing.
posted by Rhomboid at 4:22 PM on October 16, 2010

OK, I really appreciate your response, but I'm not sure I see how it helps my dilemma. First, I have to do this using processes and a shared buffer. The code I linked to creates that buffer in shared memory using mmap. I think I have the basic ideas of producing and consuming coded in there (minus the count), but I'm not sure how to implement the semaphore since I don't have the syscall for up() and down(). Ultimately I have to write them, but I was looking for examples or explanations of what they do. Preferably, I was looking for some built-in way to do this so that I could test my existing code as I go along before writing the syscall.

Also, my understanding is that the semaphore is intended to also store some kind of information about queued wakeups though I'm not completely clear on that. Any links with examples of these syscalls and semaphore usage would be wonderful.

You can see what I'm working with by viewing the code I have so far.

Thanks again!
posted by Raichle at 4:25 PM on October 16, 2010

That was directed to NetZapper
posted by Raichle at 4:26 PM on October 16, 2010


Concerning syscalls, I understand what it means. The constraints of the assignment require creating new syscalls for up() and down() and adding them to a modified kernel. I was looking for a way to use something existing for testing before even tackling how to write these. I also can't figure out exactly what they do.

Concerning semaphores: Yes, my code is missing them. I'm a little confused about where and how exactly to implement them, especially since I don't understand how I am to use up and down. Maybe I'm missing something key here.

I'll implement the buffer bounds checking, have a note to do it, just got distracted. Thanks!
posted by Raichle at 4:29 PM on October 16, 2010


Just to be clear, you are saying I need to (specifically) fork off the fork_producer and fork_consumer processes so that they will both be running at once. Is this correct?
posted by Raichle at 4:30 PM on October 16, 2010

I'm a bit confused about what you're doing. It appears that your assignment is to extend some operating system kernel with semaphore syscalls and to write a program to demonstrate that your semaphores work. And you're now trying to get the demonstration code working on some other operating system. Is that correct?
posted by reynaert at 5:24 PM on October 16, 2010

Well, it's not really demonstration code since it's part of the assignment. We have to A. implement the producer/consumer problem on sequential ints and B. Create semaphore syscalls as you described. I was hoping for some way to test A. before starting B. and some helpful info about how to tackle B. I added a basic syscall to print hello world to the kernel successfully so I understand about that much (files to modify, etc). I don't understand how to implement the semaphore syscalls and was hoping somebody could help me shed light on this with links or general info.
posted by Raichle at 5:33 PM on October 16, 2010

Yeah, I'm really confused now. All operating systems already have existing syscalls to create and manage many different kinds of locking primitives. Why would you need to modify a kernel for that? For example on Linux to do IPC you could use the SysV semaphore interfaces (sem_{init,post,wait,timedwait,...}) or you could use futexes or probably about 3 other APIs; most of the time you'd use pthreads but if your thread model constraint is separate-process then that won't work. Maybe explain a little more what exactly you're doing because it's not that clear. As to forking, you only need to fork once -- that gives you two processes running simultaneously, the parent and the child, so for example you could do the producer in the parent and the consumer in the child.
posted by Rhomboid at 6:01 PM on October 16, 2010

It would probably help to know which toy OS your instructor is using, as they're not all POSIX, and thus not necessarily portable from your modified kernel to Linux. I'm guessing you're a student at CMU?

Under Debian, the kernel manpages package provides a syscalls manual page that basically lists everything, and nothing semaphor related is available. I see a sem.h, that's probably implemented in user space on top of mutexes. The more common Linux threading stuff is pthreads, because directly interfacing with a kernel syscall interface is madness.

As far as producer/consumer and semaphores, it's pretty easy. Use a semaphore to represent the number of tokens available for consumption. The other trick is figuring out how to avoid the producer from overrunning the buffer, but that'll be easier to figure out once you're done the first bit.
posted by pwnguin at 6:07 PM on October 16, 2010

Also, major hint: fork is not what you want for shared memory producer consumer. Go back to your books and read up the difference between a process and a thread.
posted by pwnguin at 6:08 PM on October 16, 2010

(I didn't preview.) Okay, so what exactly are the semantics of these syscalls you're supposed to implement? Do they just atomically increment/decrement a word in userspace? If you haven't ever used a semaphore/mutex API before how are you supposed to design one? This sounds like an astoundingly bad assignment.
posted by Rhomboid at 6:09 PM on October 16, 2010

fork is not what you want for shared memory producer consumer. Go back to your books and read up the difference between a process and a thread.

OP already said that the assignment was to use separate processes, so a threaded solution wouldn't be acceptable.
posted by Rhomboid at 6:12 PM on October 16, 2010

pwnguin: His code already uses mmap to create shared memory.
posted by reynaert at 6:16 PM on October 16, 2010

Also, Linux has two sets of semaphore syscalls:
  • semget, semop, semctl for System V semaphores
  • futex, which is used for POSIX semaphores and pthreads

posted by reynaert at 6:22 PM on October 16, 2010

I should clarify that futexes aren't semaphores, but they can be used to implement semaphores. See the references in the Wikipedia article.
posted by reynaert at 6:27 PM on October 16, 2010

Ah, skimmed over that mmap bit.
posted by pwnguin at 6:37 PM on October 16, 2010

Abstract: the semaphore is an opaque object that tracks a count of how many widgets are in a shared buffer.

Calling down() in a consumer decrements the count by 1, but if the count is currently 0 it waits until at one producer adds at least one object.

Calling up() in a producer notifies the consumers that at least one object has been added. If any consumers are waiting for an object to be added, they will be woken up and at least one of them will be allowed to consume.

Assumption: you haven't told us what toy kernel you're playing with. This pseudo code assumes that the kernel is not multithreaded, or at the very leaste protected by a big kernel lock. If it is multithreaded... well, you'll have to use the kernel synchronization primitives to protect some of these data structures. Also assumes that the kernel is not interruptible.

Syscall Implementation:
hash<string, int> semaphores;
sem_down( string name ) {
  int count = 0;

  if (!semaphores.exists( name ))
    sem_create( name );

  # wait for a producer to add an object
  while ((count = semaphores.get( name )) <= 0)

  # consume something
  semaphores.set( name, count - 1 );

sem_up( string name ) {
  int count = 0;

  if (!semaphores.exists( name ))
    sem_create( name );

  count = semaphores.get( name );
  semaphores.set( name, count + 1 );
Now, that is not a very efficient implementation in that it waits for a process to be scheduled before it re-checks if something has been added to the semaphore. We could extend the implementation a bit which might increase performance on systems with lots of processes doing work:
struct semaphore {
  volatile int count;
  list<pid_t> waiters;
hash<string, semaphore> semaphores;
sem_down( string name ) {
  int count = 0;
  semaphore s = nil;

  if (!semaphores.exists( name ))
    sem_create( name );

  # wait for a producer to add an object
  s = semaphores.get( name );
  s.waiters.add( process_self() );

  while (s.count <= 0)

  # consume something
  s.waiters.remove( process_self() );

sem_up( string name ) {
  int count = 0;
  semaphore s = nil;

  if (!semaphores.exists( name ))
    sem_create( name );

  s = semaphores.get( name );
  foreach (pid_t waiter in s.waiters)
    process_schedule( waiter );
But like I said, this assumes you have an uninterrupted single threaded kernel. Which you might. In the other cases your kernel better have an interface to mutexes and conditions or you're dead in the water.
posted by sbutler at 7:47 PM on October 16, 2010

Please, people, STFU if you're just guessing and don't know what you're talking about. If your answer doesn't involve semaphores and/or has yield() or manual process scheduling, I'm talking to you.

A semaphore is two things:
- a counter that provides atomic increment/decrement, and
- a means of blocking your process when the counter tries to go below zero and then unblocking your process when the counter comes back up.

So, the classic producer/consumer solution has these parts:
- a producer, which sits in an infinite loop, trying to insert items into the buffer
- the buffer, usually a "bounded buffer", i.e. a fixed-size array
- a consumer, which sits in an infinite loop, trying to pull items from the buffer

The interesting part is the buffer of course. To implement that, what you need is a pair of semaphores: one indicates how many valid items are in the buffer and one indicates how many slots are free in the buffer. You arrange your code so that an attempt to insert an item wait()s on the free-counter semaphore, i.e. it must block if the bounded buffer is full, then it inserts an item and notify()s the valid-items semaphore, thereby possibly waking up the consumer.

You only need mutexes in addition to the semaphores if you want to support multiple producers and/or consumers; those mutexes act to ensure atomiticy of insertion and removal of items into the buffer.

I'll assume you know how to build a circular buffer using an array; where I write insert() and retrieve(), those are circular-buffer operations.

Some pseudocode... you initialise your two semaphores to the appropriate initial values indicating that the buffer is empty:

semaphore freespace(buffersize);
semaphore validspots(0);

This method is called in an infinite loop by your producer, with increasing values:

void produce(int value)
    freespace.wait(); // attempt to decrement, block if full
    validspots.notify(); // indicate extra item is available

And this method is called in an infinite loop by consumer thread:

int consume()
    int value;
    validspots.wait(); // wait until it's not empty
    freespots.notify(); // indicate that more room is now available
    return value;

And that is all. Of course, you still need to implement the bounded buffer, forking, memory allocation, etc, etc, but that's details. Prototype it first using POSIX semaphores and then if the latter part of the assignment requires you to do so, write your own - that is very operating-system dependent as it requires the ability to block a process and wake it up later on a condition variable.

If you have multiple producers and consumers, you need to wrap a mutex around the buffer.insert() and buffer.retrieve() calls.

If you're just lost looking for an existing semaphore implementation that you can build your bounded buffer on, google for "posix semaphore", e.g. here's a primer that uses a semaphore to implement mutex functionality.
posted by polyglot at 4:43 AM on October 17, 2010

Please, people, STFU if you're just guessing and don't know what you're talking about. If your answer doesn't involve semaphores and/or has yield() or manual process scheduling, I'm talking to you.

His kernel doesn't have semaphores, and it's part of the assignment to add them. Hence all my talk about yield and manual scheduling. My pseudo code wasn't to solve the userland problem...
posted by sbutler at 1:40 PM on October 17, 2010

My bad, sbutler.

However, I think semaphores (and the producer/consumer design pattern!) are generally much more useful when you have pre-emptive multitasking and even better, multiple CPUs. In that case, you can have one CPU flat-out producing and one consuming and neither should ever wait unless the other can't keep up. If you have cooperative multitasking, there isn't really a need for producer/consumer or semaphores.

So when you write your semaphore, you don't want to yield() - there is often no yield() method in a pre-emptive scheduling system. In down()/wait(), you want to insert the current thread (if it is to be blocked) into a queue for that semaphore and then set the thread to sleep, i.e. make it unschedulable. And then in up()/notify(), you go inspect the queue of sleeping threads and wake one or all of them.

Of course, you need to make sure that your counter and sleeper-queue are protected against concurrent mutation - a mutex is the obvious approach but you should be careful to ensure that it doesn't unnecessarily reduce concurrency by forcing all producers and all consumers to go through the one lock, i.e. only lock the sleeper-queue, only access the sleeper-queue when necessary and use a separate atomic-add operation to manipulate the counter.

A good implementation should obviously be safe (and that alone is difficult), but should also not require any thread to block on any mutex (including the one inside the semaphore operations!) unless the bounded buffer is either empty or full, i.e. one of the semaphores hits zero.
posted by polyglot at 4:51 PM on October 17, 2010

To expand on that last sentence, if you have your producers and consumers both requesting the same mutex inside the semaphore functions on every access then your implementation is no better than wrapping a mutex around the whole buffer. All concurrency of access would lost: you would either be producing or consuming a value at any given time but not both. The point of a good semaphore-based bounded buffer is that you can do both concurrently.

You can't produce and produce or consume and consume concurrently though unless you want to discard all notions of item ordering within the buffer.
posted by polyglot at 4:57 PM on October 17, 2010

Sorry for not specifically answering your base problem. You asked about code that would show correctness of a partial solution by using existing locking primitives. I answered that, with the assumption that you were using a relatively normal linux kernel of some sort. Do 'man semget', and you'll get the manual page for the SysV semaphore implementation. It's trivial to use, and gets you through testing.

I also illustrated the usage of a queue using locking and unlocking. All that's left is the implementation of the locking primitives themselves. That's your assignment, as I understood it.

I did use mutexes instead of semaphores because, 'cause I tend to use POSIX and not SysV stuff.

I didn't realize you were in kernel space, as the problem is far more interesting in userland where there's 'real' threading. If syscalls are serialized by single-threaded (non-preemeptive) or GIANTLOCKED kernel, or by default on a single-core chip, then implementing the syscalls in kernel space is as simple as incrementing a variable. And, I pointed you toward the easiest x86 assembler instruction for implementing a mutex, but read up on the xchg, and lock instructions for how you can mutex off a critical section and implement your semaphore.

But, I'm being intentionally pedantic. You're in school. We've given you lots of hints, use your noggins.
posted by Netzapper at 5:07 PM on October 17, 2010

However, I think semaphores (and the producer/consumer design pattern!) are generally much more useful when you have pre-emptive multitasking and even better, multiple CPUs. In that case, you can have one CPU flat-out producing and one consuming and neither should ever wait unless the other can't keep up. If you have cooperative multitasking, there isn't really a need for producer/consumer or semaphores.

Again, my code didn't say anything about preemption or co-operative multitasking in userland. We're talking syscall and kernel mode here. And while he could write a userland implementation of semaphores in a multithreaded program, he tells us the assignment requires the consumer and producer to be in separate processes.

I suppose you could still write a multiprocess semaphore implementation in userland using signals and shared memory (hell, glibc pthreads used to be userland) but it's much easier if you do it in kernel mode. And I think in the kernel it is entirely correct to be using yield() and schedule() type methods.
posted by sbutler at 5:26 PM on October 17, 2010

Well, I appreciate everybody's help in understanding this but it's now irrelevant. The professor came across this thread and decided that, even though I didn't ask for code or use any code from here, it was cheating. He's failing me on the project. I don't really understand why since I don't see this as any different than if I were to ask a tutor or the Computer Science help desk at my University. He's also claiming that since I had posted some of my code (very basic stuff), that if he thinks any other student derives their code from it, I will "further fail in the course". I don't know what that means but I guess I have no option but to get out of this class while I can. Frankly, if any other student hadn't gotten that far the day before the project was due (I was out with an illness, probably warranting an extension), then my code isn't going to be much help to them. I'd hate to point out how many students leave their code in their public or HTML directories on our AFS system (used to do support there). If you want to cheat, you'll cheat. I tend to think that the point is to learn the stuff and so I don't. I really didn't think that asking for help finding resources to understand this is cheating...
posted by Raichle at 10:46 AM on October 18, 2010

I'd appeal the cheating accusation, if I were you. I'd definitely drop the course, but I'd also appeal the decision if it's going to go on your record as academic dishonesty.

You want we should email him on your behalf?
posted by Netzapper at 2:10 PM on October 18, 2010

Hey, professor: if Raichle is representing you accurately above, you're wrong. And I say that as someone who spent a lot of time tutoring and marking computer science. None of the above is even remotely close to cheating.

Nothing in this thread is an answer to the assignment because it's all pseudocode and explication of algorithms and the meaning of concepts like semaphore and bounded-buffer. Nothing in this thread could be compiled or submitted; Raichle will still need to understand the explanations, read the semaphore function manpages and write a huge pile of code that is not present here.
posted by polyglot at 5:29 PM on October 18, 2010

I'm going to refrain from commenting on this further until I see how it plays out in the end. I am appealing. The argument is that collaboration is forbidden in the class policy description. I appreciate the support and I'm glad that other people see it my way. It really hurts to be accused of something like this when all I was doing was trying to figure out something I was struggling with and trying to learn. I don't feel like I've been provided the resources in this class to get the help to understand this. Thank you all again.
posted by Raichle at 6:05 PM on October 18, 2010

« Older It's not cauliflower. It only looks like...   |   Turn Left at ph'nglui mglw'nafh and then go... Newer »
This thread is closed to new comments.