Class 2 Hour 3 - Scribing by Mike Reed (reed10@ccs.neu.edu)

Revisiting User vs superuser/privileged/kernel modes
-----------------------------------------------------
Assume we are discussing a simple system using the base and bounds
registers as discussed before.

         Memory
      ............               Process View
      |          |               of Memory
      |          |
   bounds ->.....|           MAX .............
      |XXXXXXXXXX|               |XXXXXXXXXXX|
      |XXXXXXXXXX|   <======>    |XXXXXXXXXXX|
      |XXXXXXXXXX|               |XXXXXXXXXXX|
   base ->.......|             0 '''''''''''''
      |          |
      |          |
      ''''''''''''

Using the base and bounds offsets, the process sees a shrunken view of
memory that can be indexed 0-MAX, where 0 corresponds to the base offset
and MAX is the bounds offset minus the base offset.

a) For security, any attempt to access outside the range specified by
   base/bounds is not allowed.
b) To ensure a), the base/bounds registers must not be changeable by the
   process. This creates a chicken-and-egg problem: how do we first set
   the registers?

Who can update these registers?

a) Introduce 2 modes, user (aka application) and kernel (aka superuser,
   privileged). The mode is indicated by a state (a bit) in the processor.
   We can't allow user mode to switch directly into kernel mode, since
   then our protection (setting base/bounds) would be compromised. But
   the special registers still need to be changed somehow.
b) Set up a protected gateway - an interrupt.

Linux example) Linux, on a file read, needs to access memory outside of
our bounds.
   - Sends an interrupt to switch to kernel mode.
   - Kernel mode handles the interrupt, updates the memory bounds, and
     returns to user mode.
   - Important: the change to the bounds, which control what memory is
     accessible, is done in kernel mode.

Questions (this part went fast, apologies for any gaps):
--------------------------------------------------------

Q: How do file descriptor tables point to devices?
A: Each file descriptor table entry contains a pointer to a table of
   function pointers. For example:

       struct file_ops {
           read()  --> pointer to a function that takes the arguments
                       needed to read the file and returns an int
       }

   This system of pointers can create many layers of indirection and be
   very complex.

Q: How does the processor first get into kernel mode? Does the CPU start
   with the first instruction in kernel mode?
A: Depends on the OS. For PowerPC:
   - Start up without virtual memory.
   - Set up the page tables.
   - Set up virtual memory, all while in kernel mode.

Synchronization
---------------
Multiple processes + preemption = race conditions.

For this discussion, we will be talking about multiple THREADS within a
single PROCESS.
   - Shared memory, code, data, etc.

Bank Account Example) We have a function, deposit, which takes a deposit
amount and adds it to our current balance. This function, shown on the
left, gets translated into the machine instructions on the right.

    fun deposit(amt):           1. MOV balance->R
        balance += amt    =>    2. ADD amt->R
                                3. MOV R->balance

With 2 threads, t1 and t2, it is possible for a call to 'deposit' in t2
to be completely lost:

    Current Balance = 100

    t1 - deposit(50)                        t2 - deposit(75)
    1. MOV 100->R
    2. ADD 50->R
       ~~~~~~ interrupt by t2 ~~~~~~>
                                            1. MOV 100->R
                                            2. ADD 75->R
                                            3. MOV R->balance
       <~~~~~~ done, resume t1 ~~~~~~
    3. MOV R->balance

    Current Balance: 150!  Should be 225.

This interleaving makes it seem that t2 never executed, resulting in an
incorrect balance.
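(Not from class, but an easy way to see this race for yourself: the
sketch below runs many unsynchronized deposits from two POSIX threads.
The names, iteration count, and the global 'balance' are my own; with
enough iterations the final balance usually comes up short, exactly
because updates get lost as in the trace above. Adding a mutex around
the increment, as in the next section, makes the total come out right.)

    /* race.c - minimal sketch (not class code) of the lost-update race.
       Assumes POSIX threads; compile with: cc race.c -lpthread */
    #include <pthread.h>
    #include <stdio.h>

    #define DEPOSITS 1000000

    static long balance = 0;     /* shared data, deliberately unlocked */

    static void *depositor(void *arg)
    {
        long amt = (long) arg;
        for (int i = 0; i < DEPOSITS; i++)
            balance += amt;      /* MOV balance->R; ADD amt->R; MOV R->balance */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, depositor, (void *) 1);
        pthread_create(&t2, NULL, depositor, (void *) 1);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        /* Expected 2 * DEPOSITS; usually prints less: updates were lost. */
        printf("balance = %ld (expected %d)\n", balance, 2 * DEPOSITS);
        return 0;
    }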
Critical Section Problem
------------------------
Note: The book refers mostly to code, but it's important to remember
that, while the critical sections are marked in the code, it is actually
critical data that we must worry about.

In our banking example, both threads were trying to update the balance
of our account (data) at the same time. If there had been 2 separate
accounts, the 2 threads would not have interfered. Therefore, the issue
is not inherently critical blocks of code, but critical access to data.

We will use a mutex to ensure mutually exclusive access to the current
balance:

    mutex m

    fun deposit(amt):
               +-- m.lock()
    atomic --- |   balance += amt
               +-- m.unlock()

This update is atomic only if every section of code that touches the
balance (balance += amt) is locked by the same mutex.

The mutex locks a section of code. This ensures that only the thread
holding the lock can execute the code within the lock, so the thread
which holds the lock is the only thread that can update our balance.
While this accomplishes our goal, it also locks out deposits on all
accounts. Deposits to other accounts should be allowed, since they do
not interfere.

Alternatively, we can put a mutex on the accounts themselves:

    Account objects:
      _________ _
     | mutex   | |
     |---------| |
     | balance |_|
     '---------'

   - Each account object has a mutex and a balance. This means we can
     lock individual accounts.
   - Again, this lets us lock sections of data, rather than code. In our
     example this allows multiple balances to be updated at the same
     time.

Implementing a mutex
--------------------
Single CPU, possible ways of switching threads:
   1) The thread calls something in the OS (not shown in our example).
   2) An interrupt occurs (shown in our example).

Ways to solve our bank deposit problem:

1) Turning off interrupts:

       disable_interrupts()
       ...
       ...locked code...
       ...
       enable_interrupts()

   Pros: Guarantees the lock works.
   Cons: Ignoring interrupts could cause the CPU to miss I/O or other
         potentially important events/interrupts.

2) Implement a mutex as a structure with a boolean busy value:

               ....................
               |bool busy         |
       mutex:  |------------------|
               |controlBlock *wait|
               ''''''''''''''''''''

       fun lock():
           local var tmp
           disable_interrupts()
           if m.busy:
               add this thread to the wait list (a linked list)
               tmp = True
           else:
               m.busy = True
               tmp = False
           enable_interrupts()
           if tmp:
               wait()

   On unlock, if there is any thread waiting, wake it up. Otherwise,
   mark the mutex as no longer busy.

Multi-core spin-lock
--------------------
Multi-core processor:

       ......     ......
       |CPU |     |CPU |
       ''',''     ''/'''
           \      /
          .........
          |  MEM  |
          '''''''''

   - 2 CPUs (P1, P2) accessing the same memory.
   - Disabling interrupts no longer works.
   - Must rely on an atomic hardware instruction to arbitrate between
     the CPUs: the exchange instruction, XCHG.

    XCHG R, MEM           P1                       P2
                         REG         MEM          REG
                          A ----------.
                                       )  atomic exchange: the values
                          B <---------'   in REG and MEM are swapped

    P2 can only access memory before or after the exchange instruction,
    never in the middle of it.

Spin-lock example) The lock word starts at 0 (free); each CPU exchanges
a 1 into it and looks at what comes back:

    P1: XCHG R(=1), lock   sends 1, gets 0   -> lock was free;
                                                P1 now holds it (lock = 1)

    P2: XCHG R(=1), lock   sends 1, gets 1 --+
    P2: XCHG R(=1), lock   sends 1, gets 1   |  Bad! P2 is just spinning
    P2: XCHG R(=1), lock   sends 1, gets 1 --+  while waiting for the
                                                lock to be released.

    P1: XCHG R(=0), lock   sends 0, gets 1   -> unlock (lock = 0)

    P2: XCHG R(=1), lock   sends 1, gets 0   -> P2 now holds the lock

This is clearly inefficient, but it is correct. However, when used in
conjunction with thread scheduling, it can be made more efficient: the
thread would notice the lock is held, sleep (opening up the CPU to
another thread), and be scheduled to check whether the lock has been
released at a later time.
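(Aside, not class code: modern C exposes this XCHG-style primitive
portably through <stdatomic.h>, whose atomic_exchange typically compiles
down to an exchange instruction on x86. The sketch below, with made-up
names like spinlock_t, shows the same "send 1, look at what comes back"
loop as the trace above; compile with a C11 compiler, e.g.
cc -std=c11 spinlock.c.)

    /* spinlock.c - minimal spin-lock sketch using C11 atomics. */
    #include <stdatomic.h>

    typedef struct {
        atomic_int locked;       /* 0 = free, 1 = held */
    } spinlock_t;

    static void spin_lock(spinlock_t *l)
    {
        /* Keep exchanging a 1 in until we get a 0 back, which means
           the lock was free and we now own it. */
        while (atomic_exchange(&l->locked, 1) == 1)
            ;                    /* spin, burning CPU as in the trace */
    }

    static void spin_unlock(spinlock_t *l)
    {
        atomic_store(&l->locked, 0);   /* "send 0": release the lock */
    }

    /* usage:
           static spinlock_t lock = { 0 };
           spin_lock(&lock);
           balance += amt;        -- critical section
           spin_unlock(&lock);
    */

A real implementation would also pause or yield inside the spin loop,
which is the "sleep and recheck later" idea mentioned above.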
Note about Pintos linked lists:
   - C doubly linked lists.
   - Objects can live on multiple lists.
   - Read the documentation at the top of list.h.

Bonus:
Note: This was not said explicitly in class by a professor, but this is
my working, if not 100% correct, view of the linked lists in Pintos.
Buyer beware.

I am not going to draw the ASCII art for this, but the magic happens in
the macro list_entry. A thread structure contains a list_elem. The
list_elem is what actually gets linked together in the linked lists, so
a doubly linked list of threads is linked together by the threads'
list_elem members. When you get an element from a doubly linked list of
threads (using any of the various functions in list.h), you are actually
getting a list_elem. In order to get the thread object itself, you need
to use the macro list_entry:

    /* Converts pointer to list element LIST_ELEM into a pointer to
       the structure that LIST_ELEM is embedded inside. Supply the
       name of the outer structure STRUCT and the member name MEMBER
       of the list element. See the big comment at the top of the
       file for an example. */

For threads, the outer structure is 'struct thread' and its list_elem
member is 'elem'.
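(My own sketch, not from class, of how this looks in use: walk a list
with list_begin/list_end/list_next, then convert each list_elem back to
its enclosing thread with list_entry. It assumes the standard Pintos
list.h interface and that struct thread has a 'name' field and a
list_elem member named 'elem'; include paths may differ in your tree.)

    /* Sketch: printing the names of every thread on some list. */
    #include <list.h>
    #include <stdio.h>
    #include "threads/thread.h"

    static void
    print_thread_names (struct list *thread_list)
    {
      struct list_elem *e;

      for (e = list_begin (thread_list); e != list_end (thread_list);
           e = list_next (e))
        {
          /* 'e' points at the list_elem embedded in a thread; list_entry
             backs up from that member to the enclosing struct thread. */
          struct thread *t = list_entry (e, struct thread, elem);
          printf ("%s\n", t->name);
        }
    }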