- CTSS - multiprogramming and timesharing (1962)
- MULTICS - virtual memory (1965)
- a process
- large and complex - all things to all people
- command line interpreter
- UNIX - a pun on MULTICS (1969)
- stripped-down version
- "C"
- Rochester RIG - message passing over a network (1976)
- CMU Unix - port capabilities as object references (1979)
- Accent - integration of memory and IPC (1985)
- Mach - designed for multiprocessors (1989)
- incorporated recent innovations
- a few simple/powerful abstractions (and they interoperate)
Mach: What about UNIX? [293]
UNIX good points:
- multiprogrammed
- easy portability to wide class of uniprocessors
- simple programmer interface to system facilities
- extensive library
- pipes
UNIX bad points:
- not intended for multiprocessors
- kernel became repository for redundant/competing abstractions
- for example - IPC:
- sockets, streams, pipes
- shared files, system V messages, shared memory
Mach: Basic Goals [294]
- simplicity: 5 powerful user abstractions which interoperate
- integrated memory management and IPC
- extensibility:
- kernel functions may be efficiently exported to user state
- leaves just a microkernel
- compatibility: UNIX programs (binaries) fully supported
- multiprocessor systems - even heterogeneous
- large and different types of memories
- Uniform Memory Access (UMA)
- Non-Uniform Memory Access (NUMA)
- No Remote Memory Access (NORMA)
- transparent access to different types of networks
- LANs and WANs
- tightly coupled multiprocessors
Mach: Structure and Emulation [295]
- many OS run on top of Mach
- BSD provides the user interface/programming environment
Mach: UNIX Emulation in Mach [296]
- Mach kernel acts as a "trampoline"
Mach: Primitive Abstractions [297]
- task: execution environment
- provides virtual address space
- provides protected access to system resources via ports
- contains one or more threads
- it is not a process: computationally passive
- thread: unit of computation (execution)
- must run in the context of a task
- task provides address space
- all threads in a task share ports, memory, etc.
- process = task + thread
- minimal state information implies a lightweight process
Mach: Primitive Abstractions [298]
- ports: communication channel
- send/receive messages on ports
- kernel maintains capability list of rights to send/receive
- port set is a group of ports sharing a common message queue
- thread can receive on a port set - service multiple ports
- messages: basic method of communication between threads
- "typed" (self-describing) data - can be 4GB
- in-line or out-of-line (pointer) data
- port rights are passed in messages (the only way)
- memory objects: storage unit
- tasks access objects by using ports!
- map all or part of object into address space
- object may be managed by external memory manager
- examples: files, pipes
Summary of Primitive Abstractions [299]
Mach: Blend Memory and IPC [300]
- Memory Management:
- memory object represented by port
- IPC messages are sent to port (e.g. pagein, pageout)
- memory objects may easily reside on remote systems
- IPC:
- try to pass messages by moving pointers to shared memory
- try to avoid, or at least delay, copying
- let virtual memory management do the copying
Mach: Process Management - Tasks [301]
- system calls to kernel: messages on process port
- create: parent task creates children tasks
- children inherit all or selected regions of parent's memory
- shared or copied
- priority: for current or future threads
- assign: (set of) processor for new threads
- suspend: all threads in task
- resume: all threads in task
- terminate: all threads in task
Mach: Process Management - Threads [302]
- create: give function to execute and its parameters
- suspend: one thread but not the task
- resume: one thread but task may still be suspended
- all threads share the process port and other ports
- each thread has its own thread port (used, say, to terminate it)
- all threads share the address space of the task
- implies need synchronization
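
The suspend/resume rules for tasks and threads can be sketched as a toy model. This is not Mach's implementation: the `toy_task` and `toy_thread` types, the `TOY_MAX_THREADS` limit, and the single boolean flags are all assumptions made for illustration (real kernels typically keep suspend counts rather than one flag).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_THREADS 8   /* hypothetical limit for this sketch */

struct toy_task;

/* A thread is the unit of execution; it always belongs to one task. */
struct toy_thread {
    struct toy_task *task;
    bool suspended;          /* set by a thread-level suspend */
};

/* A task is passive: it only provides the environment for its threads. */
struct toy_task {
    bool suspended;          /* set by a task-level suspend */
    int nthreads;
    struct toy_thread threads[TOY_MAX_THREADS];
};

/* Create a thread inside a task; returns NULL if the task is full. */
struct toy_thread *toy_thread_create(struct toy_task *t) {
    if (t->nthreads >= TOY_MAX_THREADS) return NULL;
    struct toy_thread *th = &t->threads[t->nthreads++];
    th->task = t;
    th->suspended = false;
    return th;
}

void toy_task_suspend(struct toy_task *t)      { t->suspended = true; }
void toy_task_resume(struct toy_task *t)       { t->suspended = false; }
void toy_thread_suspend(struct toy_thread *th) { th->suspended = true; }
void toy_thread_resume(struct toy_thread *th)  { th->suspended = false; }

/* A thread may run only if neither it nor its task is suspended. */
bool toy_thread_runnable(const struct toy_thread *th) {
    assert(th != NULL && th->task != NULL);
    return !th->suspended && !th->task->suspended;
}
```

Resuming a single thread while its task is still suspended leaves the thread not runnable, which matches the "task may still be suspended" bullet.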
Mach: Thread Synchronization [303]
- mutex_lock(mutex): a wait on the mutex, implemented with a spinlock
- mutex_unlock(mutex): a signal
- condition variables:
- implement critical sections without busy waiting
- condition_wait(condition variable,mutex variable):
- unlocks mutex variable
- blocks for a condition_signal(condition variable)
- condition_signal(condition variable):
- sets condition variable to true and unblocks (all) waiting threads
- condition may not hold when wait returns
- implies need a loop for wait
Producer-Consumer Synchronization [304]
INITIALIZATION:
    int buffer[MAXBUF]; int buf_ptr = -1;   /* buf_ptr indexes the top item */
    int empty = TRUE; int full = FALSE;     /* state flags tested by the wait loops */
    mutex_alloc(mutex,1); condition_alloc(nonempty,nonfull);

    void add_buffer(int item) {
        buf_ptr++;
        buffer[buf_ptr] = item;
        empty = FALSE;
        if (buf_ptr == MAXBUF-1) full = TRUE;
    }

    int rem_buffer() {
        int item = buffer[buf_ptr];
        buf_ptr--;
        full = FALSE;
        if (buf_ptr == -1) empty = TRUE;
        return(item);
    }

PRODUCER:
    while (1) {
        nextp = produce_item();
        mutex_lock(mutex);
        while (full)                        /* re-test after every wakeup */
            condition_wait(nonfull,mutex);
        add_buffer(nextp);
        condition_signal(nonempty);
        mutex_unlock(mutex);
    }

CONSUMER:
    while (1) {
        mutex_lock(mutex);
        while (empty)                       /* re-test after every wakeup */
            condition_wait(nonempty,mutex);
        nextc = rem_buffer();
        condition_signal(nonfull);
        mutex_unlock(mutex);
        consume_item(nextc);
    }

TERMINATION: mutex_free(mutex); condition_free(nonempty,nonfull);
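
The producer-consumer slide uses Mach-style calls in pseudocode. As a runnable analog, here is a minimal sketch that swaps in POSIX threads (`pthread_mutex_*`, `pthread_cond_*`) for the Mach calls; that substitution, the `run_demo` helper, and the buffer size of 4 are assumptions of this sketch, not part of the slide.

```c
#include <assert.h>
#include <pthread.h>

#define MAXBUF 4

static int buffer[MAXBUF];
static int count = 0;                 /* number of items currently stored */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t nonfull = PTHREAD_COND_INITIALIZER;

/* Producer side: block while the buffer is full, then push one item. */
static void put_item(int item) {
    pthread_mutex_lock(&mutex);
    while (count == MAXBUF)           /* condition may not hold on wakeup */
        pthread_cond_wait(&nonfull, &mutex);
    buffer[count++] = item;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&mutex);
}

/* Consumer side: block while the buffer is empty, then pop one item. */
static int get_item(void) {
    pthread_mutex_lock(&mutex);
    while (count == 0)
        pthread_cond_wait(&nonempty, &mutex);
    int item = buffer[--count];
    pthread_cond_signal(&nonfull);
    pthread_mutex_unlock(&mutex);
    return item;
}

/* Thread body: produce the integers 1..n. */
static void *producer(void *arg) {
    int n = *(int *)arg;
    for (int i = 1; i <= n; i++)
        put_item(i);
    return NULL;
}

/* Run one producer thread, consume n items in the caller, return their sum. */
long run_demo(int n) {
    pthread_t tid;
    long sum = 0;
    int rc = pthread_create(&tid, NULL, producer, &n);
    assert(rc == 0);
    for (int i = 0; i < n; i++)
        sum += get_item();
    pthread_join(tid, NULL);
    return sum;
}
```

As on the slide, items come out in stack order rather than FIFO order, and both sides wait inside a `while` loop because the condition may no longer hold when the wait returns.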
Mach: CPU Scheduling [305]
- only threads are scheduled - no knowledge of task is needed
- CPUs and threads assigned to processor sets (independently)
- implies threads that need computing power have CPUs at their disposal
- thread has priority:
- base priority set by thread (within a limit)
- current priority = base priority + f(recent CPU usage)
- 32 global run queues for each processor set: one for each priority
- lock the global run queues
- find the highest priority thread (use hints)
- 1 local run queue per CPU, searched first ("highest priority"), for threads bound to that CPU: e.g., drivers for its I/O devices
- thread is given one quantum to run:
- when the quantum expires, check the queues again; if they are empty or hold only lower priorities, the thread runs again
- with each tick: give thread a lower priority
- quantum is constant across a processor set
- quantum increases as CPUs go up - or threads go down
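
The 32 priority run queues and the "hint" for finding the highest-priority thread can be sketched as below. This is an illustrative model only: the `runq` layout, the `QCAP` capacity, the choice that 0 is the highest priority, and the usage function `recent_ticks / 4` are assumptions, and the locking of the global queues is left out.

```c
#include <assert.h>

#define NPRI 32          /* one run queue per priority, as on the slide */
#define QCAP 16          /* hypothetical per-queue capacity */

/* Assumption: 0 is the highest priority and 31 the lowest. */
struct runq {
    int ids[NPRI][QCAP];  /* FIFO ring of thread ids for each priority */
    int head[NPRI];
    int len[NPRI];
    int hint;             /* no nonempty queue has an index below hint */
};

void runq_init(struct runq *rq) {
    for (int p = 0; p < NPRI; p++) { rq->head[p] = 0; rq->len[p] = 0; }
    rq->hint = NPRI;
}

/* Current priority = base priority + f(recent CPU usage), clamped. */
int current_priority(int base, int recent_ticks) {
    int pri = base + recent_ticks / 4;   /* f() is an arbitrary choice here */
    return pri >= NPRI ? NPRI - 1 : pri;
}

/* Append a thread to the queue for its priority; returns 0 on success. */
int runq_enqueue(struct runq *rq, int thread_id, int pri) {
    assert(pri >= 0 && pri < NPRI);
    if (rq->len[pri] == QCAP) return -1;
    int tail = (rq->head[pri] + rq->len[pri]) % QCAP;
    rq->ids[pri][tail] = thread_id;
    rq->len[pri]++;
    if (pri < rq->hint) rq->hint = pri;
    return 0;
}

/* Remove and return the highest-priority thread, or -1 if all are empty. */
int runq_pick(struct runq *rq) {
    for (int p = rq->hint; p < NPRI; p++) {
        if (rq->len[p] > 0) {
            int id = rq->ids[p][rq->head[p]];
            rq->head[p] = (rq->head[p] + 1) % QCAP;
            rq->len[p]--;
            rq->hint = p;     /* queues below p were just found empty */
            return id;
        }
    }
    rq->hint = NPRI;
    return -1;
}
```

The hint lets the search skip queues already known to be empty; a thread that has used more CPU recently gets a numerically larger (worse) priority and is picked later.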
Mach: Ports [306]
- bounded queue within the kernel
- capability: send or receive "right"
- only one receiver for each port (but must have right)
- multiple senders for each port (but must have right)
- allocate: new port (and get the rights)
- creator can give out rights in messages
- if receive right sent in a message, sender loses the right
- task allocates ports to the objects that it owns
- deallocate: revocation of all rights
- port sets: can only have receive rights
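
The port rules above (bounded queue, one receiver, many senders, receive right moves away from its sender) can be modeled in a few lines. Everything named `toy_*` is hypothetical, tasks are plain integers, and in this sketch the creator starts with both rights; it is a model of the rules, not the Mach interface.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NTASKS 4     /* hypothetical number of tasks */
#define TOY_QLEN   2     /* bounded kernel queue length */

struct toy_port {
    int receiver;                 /* the single task holding the receive right */
    bool send_right[TOY_NTASKS];  /* tasks holding a send right */
    int queue[TOY_QLEN];          /* bounded message queue (ring buffer) */
    int head, len;
};

/* allocate: the creator gets the rights to the new port. */
void toy_port_allocate(struct toy_port *p, int creator) {
    assert(creator >= 0 && creator < TOY_NTASKS);
    for (int t = 0; t < TOY_NTASKS; t++) p->send_right[t] = false;
    p->receiver = creator;
    p->send_right[creator] = true;
    p->head = p->len = 0;
}

/* Hand a send right to another task; only a holder may do so. */
bool toy_give_send_right(struct toy_port *p, int from, int to) {
    if (!p->send_right[from]) return false;
    p->send_right[to] = true;
    return true;
}

/* Move the receive right; the previous holder loses it. */
bool toy_move_receive_right(struct toy_port *p, int from, int to) {
    if (p->receiver != from) return false;
    p->receiver = to;
    return true;
}

/* send: fails without a send right or when the bounded queue is full. */
bool toy_send(struct toy_port *p, int task, int msg) {
    if (!p->send_right[task] || p->len == TOY_QLEN) return false;
    p->queue[(p->head + p->len) % TOY_QLEN] = msg;
    p->len++;
    return true;
}

/* receive: only the holder of the receive right may dequeue. */
bool toy_receive(struct toy_port *p, int task, int *msg) {
    if (p->receiver != task || p->len == 0) return false;
    *msg = p->queue[p->head];
    p->head = (p->head + 1) % TOY_QLEN;
    p->len--;
    return true;
}
```

Because every operation checks the caller's right first, holding a right is what grants access; knowing the port's identity alone is not enough.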
Mach: Ports and Capabilities [307]
Mach: Network Messages [308]
- NetMsgServer (NMS) is in user space
- R sends S a message with SEND right: NMS creates proxy X
- S sends R a message via proxy X: NMS delivers to port X
- networkwide name server: allows tasks to register ports for lookup
Mach: Messages [309]
- fixed-length header and variable number of typed data objects
- data
- port rights
- pointers to out-of-line data
- send message
- SEND_TIMEOUT: sending data too fast
- SEND_NOTIFY: if cannot be sent now, notify when OK
- receive message
- RCV_TIMEOUT: block for only so long
- RCV_NO_SENDERS: return if no senders
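
A message with a fixed-length header followed by typed items might be laid out as in this sketch. The `toy_msg` and `toy_item` types and their fields are hypothetical stand-ins chosen for illustration; they are not the actual Mach message structures.

```c
#include <assert.h>
#include <stddef.h>

enum toy_item_kind { TOY_INLINE_DATA, TOY_PORT_RIGHT, TOY_OUT_OF_LINE };

/* Each item is self-describing: its kind and size travel with it. */
struct toy_item {
    enum toy_item_kind kind;
    size_t size;           /* bytes of data described by this item */
    const void *data;      /* in-line bytes, or the out-of-line pointer */
};

#define TOY_MAX_ITEMS 8    /* hypothetical limit for this sketch */

/* Fixed-length header followed by a variable number of typed items. */
struct toy_msg {
    int dest_port;         /* port the message is sent to */
    int reply_port;        /* port for the answer, if any */
    int nitems;
    struct toy_item items[TOY_MAX_ITEMS];
};

void toy_msg_init(struct toy_msg *m, int dest_port, int reply_port) {
    m->dest_port = dest_port;
    m->reply_port = reply_port;
    m->nitems = 0;
}

/* Append one typed item; returns 0 on success, -1 if the message is full. */
int toy_msg_add(struct toy_msg *m, enum toy_item_kind kind,
                const void *data, size_t size) {
    assert(m != NULL);
    if (m->nitems == TOY_MAX_ITEMS) return -1;
    m->items[m->nitems].kind = kind;
    m->items[m->nitems].size = size;
    m->items[m->nitems].data = data;
    m->nitems++;
    return 0;
}

/* Bytes carried in the message body: an out-of-line item contributes
   only its pointer, not its payload. */
size_t toy_msg_inline_bytes(const struct toy_msg *m) {
    size_t total = 0;
    for (int i = 0; i < m->nitems; i++) {
        if (m->items[i].kind == TOY_OUT_OF_LINE)
            total += sizeof(void *);
        else
            total += m->items[i].size;
    }
    return total;
}
```

A 1MB out-of-line item adds only one pointer to the message body, which is why large transfers are sent out-of-line.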
Mach: Messages - Out-of-Line Data [310]
- pointer would be invalid in receiver's address space
- copy-on-write:
- put the virtual memory map into the receiver's space
- faster than copying the data itself
- if receiver only reads - OK
- if receiver writes to a page - protection fault
- distinction: read-only vs. copy-on-write
- make copy of just the page, map it to receiver's space
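
Copy-on-write for out-of-line data can be simulated in user space as a rough sketch. The frame pool, the reference counts, and all `toy_*` names are assumptions of this model; a real kernel would detect the write through a page protection fault rather than an explicit check.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 1024     /* hypothetical page size */
#define NFRAMES   8        /* hypothetical pool of physical frames */
#define NPAGES    2        /* pages in the out-of-line region */

static char frames[NFRAMES][PAGE_SIZE];
static int refcount[NFRAMES];
static int next_free = 0;

struct toy_map { int frame[NPAGES]; };   /* page index -> frame index */

static int frame_alloc(void) {
    assert(next_free < NFRAMES);
    refcount[next_free] = 1;
    return next_free++;
}

/* Sender side: back a fresh region with private, zeroed frames. */
void toy_region_create(struct toy_map *m) {
    for (int i = 0; i < NPAGES; i++) {
        m->frame[i] = frame_alloc();
        memset(frames[m->frame[i]], 0, PAGE_SIZE);
    }
}

/* "Send" the region: copy the map entries, not the data. */
void toy_region_share(const struct toy_map *src, struct toy_map *dst) {
    for (int i = 0; i < NPAGES; i++) {
        dst->frame[i] = src->frame[i];
        refcount[src->frame[i]]++;
    }
}

char toy_read(const struct toy_map *m, int page, int off) {
    return frames[m->frame[page]][off];
}

/* Write: if the frame is shared, copy just that one page first. */
void toy_write(struct toy_map *m, int page, int off, char value) {
    int f = m->frame[page];
    if (refcount[f] > 1) {            /* stands in for the protection fault */
        int copy = frame_alloc();
        memcpy(frames[copy], frames[f], PAGE_SIZE);
        refcount[f]--;
        m->frame[page] = copy;
        f = copy;
    }
    frames[f][off] = value;
}
```

Reads never copy anything; only the page that is actually written gets a private copy, and the other pages stay shared.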
Mach: Memory Management [311]
- object-oriented: message to port associated with memory object
- page fault: message to the object's port
- user-level memory managers instead of kernel
- external pager
- task may not have manager for a region
- (not on secondary storage)
- manager may fail to reduce resident pages when asked
- use Mach's default memory manager
- FIFO with "second chance"
- secondary storage: just like any other object
- physical memory: cache onto memory objects
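
FIFO with second chance can be sketched as a clock over the resident frames. This is a textbook-style model, not the default memory manager's code: the `clock_state` type, the three-frame pool, and the choice to load a new page with its reference bit clear are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define NFRAMES 3          /* hypothetical number of resident frames */

struct clock_state {
    int page[NFRAMES];         /* page held by each frame, -1 if free */
    bool referenced[NFRAMES];  /* set on a hit, cleared by the hand */
    int hand;                  /* next frame to examine, in FIFO order */
};

void clock_init(struct clock_state *c) {
    for (int i = 0; i < NFRAMES; i++) {
        c->page[i] = -1;
        c->referenced[i] = false;
    }
    c->hand = 0;
}

/* Touch a page. Returns true on a hit. On a fault, loads the page and
   stores the evicted page in *evicted (-1 if a free frame was used). */
bool clock_access(struct clock_state *c, int page, int *evicted) {
    assert(page >= 0);
    *evicted = -1;
    for (int i = 0; i < NFRAMES; i++) {
        if (c->page[i] == page) {
            c->referenced[i] = true;
            return true;
        }
    }
    /* Fault: skip frames whose bit is set, clearing the bit as we pass. */
    while (c->page[c->hand] != -1 && c->referenced[c->hand]) {
        c->referenced[c->hand] = false;   /* second chance is now used up */
        c->hand = (c->hand + 1) % NFRAMES;
    }
    *evicted = c->page[c->hand];
    c->page[c->hand] = page;
    c->referenced[c->hand] = false;
    c->hand = (c->hand + 1) % NFRAMES;
    return false;
}
```

With three frames loaded in the order 1, 2, 3 and page 1 touched again, the next fault evicts page 2, whereas plain FIFO would have evicted page 1.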
Mach: Virtual Memory [312]
- 32-bit addresses imply a 4GB address space
- 1K page size implies 4 million page table entries (2^32 / 2^10 = 2^22)
- Mach's virtual memory: sparse
- allocate region of virtual memory
- specify base VM address and size (say 50MB)
- used for file objects, large messages
- regions can be shared/inherited with other tasks
- deallocate region
- many holes of unallocated VM space
- page table: not the regular kind
- entries for only currently allocated regions
- cannot simply index into the table
- check for page in valid region
- address map
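
Checking whether an address falls in a valid region of a sparse address space can be sketched as a lookup over a sorted list of regions. The `toy_region` type and the binary search over an array are assumptions of this sketch (the slides do not say how the address map is searched); `flat_table_entries` just reproduces the page table arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per allocated region; holes have no entry at all. */
struct toy_region {
    uint32_t start;        /* first valid address */
    uint32_t size;         /* length in bytes */
};

/* Find the region containing addr in a map sorted by start address.
   Returns NULL if addr lies in a hole. */
const struct toy_region *toy_map_lookup(const struct toy_region *map,
                                        size_t n, uint32_t addr) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < map[mid].start)
            hi = mid;
        else if (addr - map[mid].start >= map[mid].size)
            lo = mid + 1;
        else
            return &map[mid];
    }
    return NULL;
}

/* Entries a flat page table would need to cover the whole address space. */
uint64_t flat_table_entries(unsigned address_bits, uint32_t page_size) {
    assert(address_bits < 64 && page_size > 0);
    return ((uint64_t)1 << address_bits) / page_size;
}
```

A map with two regions needs two entries no matter how large the holes between them are, while a flat table for 32-bit addresses and 1K pages needs 4,194,304 entries.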
Mach: Address Map for Sparse VM [313]
Distributed Shared Memory Server [314]
- shared page is readable: may be replicated on multiple machines
- shared page is writable: only one copy
- DSM knows which machines have the page
- reader writes to the page: DSM sends invalidate messages to the kernels holding copies
- upon acknowledgement: single writer is given permission
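
The multiple-reader/single-writer rule for one shared page can be sketched as a small state record kept by the server. The `dsm_page` type, the machine count, and the decision to downgrade a writer when another machine reads are assumptions; message passing and acknowledgements are reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

#define NMACH 4                 /* hypothetical number of machines */

struct dsm_page {
    bool has_copy[NMACH];       /* which machines hold a copy of the page */
    int writer;                 /* machine with write permission, or -1 */
    int invalidations;          /* invalidate messages sent so far */
};

/* The page starts as a single readable copy on its home machine. */
void dsm_init(struct dsm_page *p, int home) {
    assert(home >= 0 && home < NMACH);
    for (int i = 0; i < NMACH; i++) p->has_copy[i] = false;
    p->has_copy[home] = true;
    p->writer = -1;
    p->invalidations = 0;
}

/* Read fault on machine m: replicate the page there. A different
   machine's write permission is dropped so the page is read-shared. */
void dsm_read(struct dsm_page *p, int m) {
    if (p->writer != m) p->writer = -1;
    p->has_copy[m] = true;
}

/* Write fault on machine m: invalidate every other copy, then grant
   m the single writable copy. */
void dsm_write(struct dsm_page *p, int m) {
    for (int i = 0; i < NMACH; i++) {
        if (i != m && p->has_copy[i]) {
            p->has_copy[i] = false;
            p->invalidations++;   /* one message per remote copy */
        }
    }
    p->has_copy[m] = true;
    p->writer = m;
}

/* Number of machines currently holding a copy. */
int dsm_copies(const struct dsm_page *p) {
    int n = 0;
    for (int i = 0; i < NMACH; i++) n += p->has_copy[i] ? 1 : 0;
    return n;
}
```

After a write there is exactly one copy of the page, so readers and the writer can never see different contents.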
Ted Billard
2001-11-17