免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 4077 | 回复: 3
打印 上一主题 下一主题

Non-Blocking Algorith [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2008-12-05 08:17 |只看该作者 |倒序浏览
原文: http://blog.chinaunix.net/u/8057/showart_1672481.html

Author&#58; ZC Miao <>

Revision&#58;

- 2008-12-04

   Add revision information.

- 2008-11-29

   First revision.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Non-Blocking Algorithm Introduction
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



In computer science, non-blocking synchronization ensures that threads
competing for a shared resource do not have their execution indefinitely
postponed by mutual exclusion. A non-blocking algorithm is lock-free if there is
guaranteed system-wide progress; wait-free if there is also guaranteed
per-thread progress[1].


This simple diagram shows more clearly the idea&#58;


  1.     +------------------------------+--------------------------------+
  2.     |                              |                                |
  3.     |      Mutual Exclusion        |  Non-Blocking                  |
  4.     |                              |                                |
  5.     |                              |    +-------------------------+ |
  6.     |          |     |             |    |                         | |
  7.     |          |  H  |             |    |   Lock-Freedom          | |
  8.     |          |  y  |             |    |                         | |
  9.     |          |  b  |             |    |    +------------------+ | |
  10.     | Blocking |  r  | Busy-Waiting|    |    |                  | | |
  11.     |          |  i  |  (Polling)  |    |    |   Wait-Freedom   | | |
  12.     |          |  d  |             |    |    |                  | | |
  13.     |          |  s  |             |    |    +------------------+ | |
  14.     |          |     |             |    |                         | |
  15.     |                              |    +-------------------------+ |
  16.     |                              |                                |
  17.     +------------------------------+--------------------------------+
复制代码


Using mutual exclusion is the most instinctive and easiest way to protect data
from corruption in concurrent multi-threaded programming, which simply creates a
critical region around the operations of data structure. It works fine most of
the situations, but in some situations, lock is just too expensive or even not
available. One example is for some ISR(Interrupt Service Routines), if it
disables the interruption, then the scheduler will not be run until they open
the interruption again, if it also waits for some resources which is hold by
some task, then it will never available because scheduler stops. Another example
is in RTOS, higher priority wait for the lower priority task to release the
resource, then it causes priority-inversion. So, non-blocking algorithms are
born to solve these problem without using mutual exclusions.

Valois published his most cited paper talking about lock-free data structure at
1995[2]. But unfortunately, there are some mistakes with the memory management
method introduced by Valois[3].

But algorithms from Valois needs special memory management, and it's just too
complicated to use. Michael and Scott published their own algorithms based on
the idea of cooperative task from Valois[4].

There's another more detailed notes on lock-free and wait-free algorithms area
from Internet[5].

==================== Footnote ====================
[1]  Non-blocking synchronization from Wikipedia
&nbsp;    http&#58;//en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms#Wait-freedom
[2]  J.D. Valois, Lock-Free Data Structures PhD Thesis, Rensselaer Polytechnic
&nbsp;    Institute, Department of Computer Science, 1995.
[3]  Maged M. Michael, Michael L. Scott Correction of a Memory Management Method
&nbsp;    for Lock-Free Data Structures, Department of Computer Science University of
&nbsp;    Rochester, 1995.
[4]  Maged M. Michael Michael L. Scott, Simple, Fast, and Practical Non-Blocking
&nbsp;    and Blocking Concurrent Queue Algorithms,, Department of Computer Science
&nbsp;    University of Rochester, 1995.
[5]  Some notes on lock-free and wait-free algorithms
&nbsp;    http&#58;//www.audiomulch.com/~rossb/code/lockfree

[ 本帖最后由 hellwolf 于 2008-12-5 08:30 编辑 ]

评分

参与人数 1可用积分 +30 收起 理由
bitmilong + 30 原创内容

查看全部评分

论坛徽章:
0
2 [报告]
发表于 2008-12-05 08:43 |只看该作者

Non-Blocking Algorithm - Fix-size Slots Management

原文: http://blog.chinaunix.net/u/8057/showart_1673353.html

Author&#58; ZC Miao <>

Revision&#58;

- 2008-12-04

   Add revision information.

   Realistic Limitations for LL/SC.

- 2008-11-30

   First revision.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Problem
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Provide a non-blocking algorithm that maintains an array of fixed-size slots
with the following interface&#58;

- void InitSlots(void)

   Slots initialization.

- SLOTID GetSlot(void)

   Exclusively get an id of one free slot, and prevent other user to get this
   slot.

- void PutSlot(SLOTID id)

   Reclaim a slot returned by GetSlot, and make it free to get again.

- DATA* DecodeSlotID(SLOTID id)

   Decode a id of slot, and return the memory address of that slot.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Primitive CAS
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Atomic operation CAS(compare-and-swap) atomically compares the contents of a
memory location to a given value and, if they are the same, modifies the
contents of that memory location to a given new value[1]. The following
algorithm assume that CAS is available on the target hardware.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  First Algorithm(with mistakes)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-




  1. typedef struct
  2. {
  3.     int next;
  4.     DATA data;
  5. }Node;

  6. Node Slots[MAX_SLOT_NUM];

  7. int head;

  8. void InitSlots(void)
  9. {
  10.     int i;

  11.     for (i = 0; i < MAX_SLOT_NUM - 1; ++i)
  12.     {
  13.         Slots[i].next = i+1;
  14.     }
  15.     Slots[i].next = NULL_ID;
  16.     head = 0;
  17. }

  18. int GetSlot(void)
  19. {
  20.     int oldhead;
  21.     int next;

  22.     do {
  23.         oldhead  = head;
  24.         if (oldhead == NULL_ID)
  25.         {
  26.             return NULL_ID;
  27.         }
  28.         else
  29.         {
  30.             next = Slots[oldhead].next;
  31.         }
  32.     } while(!CAS(&head, oldhead, next));

  33.     return oldhead;
  34. }

  35. void PutSlot(int id)
  36. {
  37.     int oldhead;

  38.     do {
  39.         oldhead = head;
  40.         Slots[id].next = oldhead;
  41.     } while(!CAS(&head, oldhead, id));
  42. }

  43. DATA* DecodeSlotID(int id)
  44. {
  45.   return &Slots[id].data;
  46. }
复制代码



The idea of this algorithm is to use CAS to check when modify the head, if it's
still the old value. This is commonly called read-modify-write. But this arises
the well-known ABA problem.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  ABA Problem
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


The above algorithm has a subtle problem, it assumes that if the id didn't
change, then the list remains the same also. But it's very common to happen that
other tasks takes head and head.next and then returns the head, now the
head.next actually changed. This problem is known as ABA problem[2].

There are several ways to solve it. Valois gave a methodology of memory
management which tracks use count of pointers[3]. This way assures that a
pointer possessing by some one will never be allocated until no one has a copy
of the pointer, thus avoiding the ABA problem to happen. Michael and Scott
publishes their fixes on Valois's memory management mistakes[4].

Another way is to use pointer tag, which adds an extra "tag" bits to the
pointer. The "tag" usually increments itself on every copy operation. Because of
this, the next compare-and-swap will fail, even if the addresses are the same,
because the tag bits will not match. This does not completely solve the problem,
as the tag bits will eventually wrap around, but helps to avoid it.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Use Pointer Tag to Avoid ABA Problem
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-




  1. typedef union
  2. {
  3.     /** to write */
  4.     uint_t Value;

  5.     /** to write */
  6.     struct
  7.     {
  8.         /** to write */
  9.         uhalfint_t Counter;
  10.         /** to write */
  11.         uhalfint_t Index;
  12.     } Data;
  13. } SLOTID;
复制代码



Type "uhalfint_t" is half length of uint_t, uint_t is unsigned integer type. The
"Counter" here is the "tag" of the pointer.

Now the algorithm looks like this&#58;



  1. typedef struct
  2. {
  3.     SLOTID next;
  4.     DATA data;
  5. }Node;

  6. Node Slots[MAX_SLOT_NUM];

  7. SLOTID head;

  8. static inline
  9. SLOTID NewSLOTID(uhalfint_t index)
  10. {
  11.     SLOTID id;

  12.     id.Data.Counter = 0;
  13.     id.Data.Index = index;

  14.     return id;
  15. }

  16. static inline
  17. bool SLOTID_CAS(SLOTID *id, SLOTID oldid, SLOTID newid)
  18. {
  19.     /* Increae the counter to avoid ABA problem */
  20.     ++newid.Data.Counter;

  21.     return CAS(&id->Value,  oldid.Value, oldid.Value);
  22. }

  23. void InitSlots(void)
  24. {
  25.     int i;

  26.     for (i = 0; i < MAX_SLOT_NUM - 1; ++i)
  27.     {
  28.         Slots[i].next = NewSLOTID(i+1);
  29.     }
  30.     Slots[i].next = NewSLOTID(NULL_ID);
  31.     head = NewSLOTID(0);
  32. }

  33. SLOTID GetSlot(void)
  34. {
  35.     SLOTID oldhead;
  36.     SLOTID next;

  37.     do {
  38.         oldhead  = head;
  39.         if (oldhead == NULL_ID)
  40.         {
  41.             return NULL_ID;
  42.         }
  43.         else
  44.         {
  45.             next = Slots[oldhead.Data.Index].next;
  46.         }
  47.     } while(!SLOTID_CAS(&head, oldhead, next));

  48.     return oldhead;
  49. }

  50. void PutSlot(SLOTID id)
  51. {
  52.     SLOTID oldhead;

  53.     do {
  54.         oldhead = head;
  55.         Slots[id.Data.Index].next = oldhead;
  56.     } while(!SLOTID_CAS(&head, oldhead, id));
  57. }

  58. DATA* DecodeSlotID(SLOTID id)
  59. {
  60.   return &Slots[id.Data.Index].data;
  61. }
复制代码



The key algorithm is the SLOTID_CAS operation&#58; every time it's going to change
the SLOTID, it uses SLOTID_CAS, which increase the Counter then CAS. This makes
the ABA like ABA'. The index can be the same, but the counter is unlikely the
same, the wider range of Counter is, the lesser possibility ABA will happen. On
a 32-bit machine, range of Counter is &#91;0..2^16].


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Wider CAS
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


The problem of packing counter and index into a integer is obvious&#58; the
limitation number of array elements on a 32-bit machine is 2^16. And the counter
limitation is 2^16, after that it wraps to 0. 2^16 is not a big enough value to
soothe the skeptics, so on some architecture wider CAS is provided. Wider CAS is
different from Multi CAS, the former can CAS on an adjacent memory fields thus
be called wider, but the latter can CAS on unrelated memory fields.

On later inter x86 processor, it provides an instruction called CMPXCHG8B, which
compare-and-swap-8-bytes[5]. By this instruction, we can operate on a normal
memory pointer instead of a memory pointer, and its "tag" which has a range as
large as 2^32.

CAS2 is a realistic existence of Multi CAS, but only on some Motorola 680x0
processors.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Load-Link and Store-Conditional(LL/SC)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



In computer science, load-link (LL, also known as "load-linked" or "load
and reserve") and store-conditional (SC) are a pair of instructions that
together implement a lock-free atomic read-modify-write operation.
Load-link returns the current value of a memory location. A subsequent
store-conditional to the same memory location will store a new value only if no
updates have occurred to that location since the load-link. If any updates have
occurred, the store-conditional is guaranteed to fail, even if the value read by
the load-link has since been restored. As such, an LL/SC pair is stronger than a
read followed by a compare-and-swap (CAS), which will not detect updates if the
old value has been restored.[6]


LL/SC can finally make skeptics happy, it doesn't just make ABA problem look
like ABA', but solve it in another say. The algorithm with LL/SC can be&#58;



  1. typedef struct
  2. {
  3.     int next;
  4.     DATA data;
  5. }Node;

  6. Node Slots[MAX_SLOT_NUM];

  7. int head;

  8. void InitSlots(void)
  9. {
  10.     int i;

  11.     for (i = 0; i < MAX_SLOT_NUM - 1; ++i)
  12.     {
  13.         Slots[i].next = i+1;
  14.     }
  15.     Slots[i].next = NULL_ID;
  16.     head = 0;
  17. }

  18. int GetSlot(void)
  19. {
  20.     int oldhead;
  21.     int next;

  22.     do {
  23.         oldhead  = LL(&head);
  24.         if (oldhead == NULL_ID)
  25.         {
  26.             return NULL_ID;
  27.         }
  28.         else
  29.         {
  30.             next = Slots[oldhead].next;
  31.         }
  32.     } while(!SC(&head, next));

  33.     return oldhead;
  34. }

  35. void PutSlot(int id)
  36. {
  37.     int oldhead;

  38.     do {
  39.         oldhead = LL(&head);
  40.         Slots[id].next = oldhead;
  41.     } while(!SC(&head, id));
  42. }

  43. DATA* DecodeSlotID(int id)
  44. {
  45.   return &Slots[id].data;
  46. }
复制代码



To use the above algorithm, the users have to also use LL/SC to load and store
the slot id because of the ABA problem. But since realistic LL/SC implementations
all have limitations, actually LL/SC version algorithm is more difficult to use
than the CAS version.

=======================================================
    Realistic Limitations
=======================================================


Real implementations of LL/SC can be found on Alpha, PowerPC[7], MIPS, ARMv6(or
above). But they're all Weak LL/SC, SC can fail even if there's no update
between LL and corresponding SC, for example&#58;

- The CPU can only reserves a memory region at a time.

- A context switching between them can cause the SC fail.

The first limitation can cause a lot of ideal algorithms based on LL/SC fail. So
CAS version algorithm is still preferable.



=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Conclusion
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Wait-Free Fix-size Slots Management is a simple version of non-blocking memory
management implementation, it only manage fix-size slots. But it already make a
lot of Non-Blocking algorithm common problems and tricks emerging. Later we'll
see a implementation of Wait-Free Fifo Queue based on this Slots Management.


==================== Footnote ====================
[1]  Compare-and-swap from Wikipedia
&nbsp;    http&#58;//en.wikipedia.org/wiki/Compare-and-swap
[2]  ABA problem from Wikipedia
&nbsp;    http&#58;//en.wikipedia.org/wiki/ABA_problem
[3]  J.D. Valois, Lock-Free Data Structures PhD Thesis, Rensselaer Polytechnic
&nbsp;    Institute, Department of Computer Science, 1995.
[4]  Maged M. Michael, Michael L. Scott Correction of a Memory Management Method
&nbsp;    for Lock-Free Data Structures, Department of Computer Science University of
&nbsp;    Rochester, 1995.
[5]  "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A&#58;
&nbsp;    Instruction Set Reference, A-M" (PDF). Retrieved on 2007-12-15.
[6]  Load-Link/Store-Conditional from Wikipedia
&nbsp;    http&#58;//en.wikipedia.org/wiki/Load-Link/Store-Conditional
[7]  "Power ISA Version 2.05". Power.org (2007-10-23). Retrieved on 2007-12-18.

论坛徽章:
0
3 [报告]
发表于 2008-12-05 08:45 |只看该作者

Non-Blocking Algorithm - FIFO Queue

原文: http://blog.chinaunix.net/u/8057/showart_1680274.html

Author&#58; ZC Miao <>

Revision&#58;

- 2008-12-04

   First revision.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Problem
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Provide a non-blocking algorithm that provides a FIFO Queue with the following
interface&#58;

- void InitFIFO(void)

   FIFO queue initialization.

- bool Enqueue(DATA data)

   Push a data into the FIFO queue. Return TRUE if success, return FALSE only if
   the FIFO is full.

- bool Dequeue(DATA* pdata)

   Pull a data from the FIFO queue. Return TRUE if success, return FALSE only if
   the FIFO is empty.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Link-List Structure
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


The algorithm introduced here is based on a link-list structure[1]. It's a
single link-list with head and tail pointers. It looks like&#58;


  1.    +-------+     +-------+     +-------+     +-------+
  2.    |       |     |       |     |       |     |       |
  3.    |      -+---->|      -+---->|      -+---->|      -+---->NULL
  4.    |       |     |       |     |       |     |       |
  5.    +---^---+     +-------+     +-------+     +---^---+
  6.        |                                         |
  7.        |                                         |
  8.        |                                         |
  9.       HEAD                                      TAIL
复制代码


New data is pushed to the tail, and old data is pulled from head.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Dequeue Data, first thought
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


To dequeue a data, first read the HEAD value, and use CAS to swap the HEAD
pointer to it's next pointer. And the old HEAD value is the data that
dequeued. To avoid ABA problem[3], we should use SLOTID_CAS which described in
the "Fix-size Slots Management"[2].


  1.    +-------+     +-------+     +-------+     +-------+
  2.    |       |     |       |     |       |     |       |
  3.    |      -+---->|      -+---->|      -+---->|      -+----> NULL
  4.    |       |     |       |     |       |     |       |
  5.    +-------+     +---^---+     +-------+     +---^---+
  6.                      |                           |
  7.                      |                           |
  8.                      |                           |
  9.                     HEAD                        TAIL
复制代码



=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Enqueue Data
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Enqueue operation is more complicated than dequeue operation, because it has to
change two state&#58; the next pointer of the tail slot, and the tail pointer. If we
have MCAS(Multi CAS) or at least Double CAS[4], then the problem is easy to
solve. But since DCAS is not widely available in computer system, so a so-called
cooperative method is introduced firstly by Valois[5].

In cooperative method, enqueue operation first allocates a new slot, and sets
next pointer of tail slot to the new slot. Then set tail pointer to new
slot. All theses set operation use SLOTID_CAS, so the second step can fail, if
someone others enters in, and set the tail pointer. Which leaves the FIFO in a
intermediate state&#58;


  1.    +-------+     +-------+     +-------+     +-------+
  2.    |       |     |       |     |       |     |       |
  3.    |      -+---->|      -+---->|      -+---->|      -+---->NULL
  4.    |       |     |       |     |       |     |       |
  5.    +---^---+     +-------+     +---^---+     +-------+
  6.        |                           |
  7.        |                           |
  8.        |                           |
  9.       HEAD                        TAIL
复制代码


To complete the intermediate state&#58;

1. The next enqueue operation should try to move the TAIL forward, until the
    TAIL at the end, the enqueue operation should not change the TAIL slot next
    pointer.

2. The next dequeue operation should also try to move the TAIL forward when the
    head meets the tail pointer.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Algorithm
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


This algorithm uses CAS version of "Fix-size Slots Management"[2].




  1. typedef struct
  2. {
  3.     DATA data;
  4.     SLOTID next;
  5. } Node;

  6. SLOTID head;
  7. SLOTID tail;

  8. void InitFIFO(void)
  9. {
  10.     Node *sentinel;

  11.     /* This algorithm relies on the fix-size slots management */
  12.     InitSlots();

  13.     /* Sentinel slots */
  14.     head = tail = GetSlot();
  15.     sentinel = DecodeSlotID(head);
  16.     sentinel->next = NULL_ID;
  17. }

  18. /*
  19.   Origin Algorithm:
  20.   E01: node = new node()
  21.   E02: node->value = value
  22.   E03: node->next.ptr = NULL
  23.   E04: loop
  24.   E05:   tail = Q->Tail
  25.   E06:   next = tail.ptr->next
  26.   E07:   if tail == Q->Tail
  27.   E08:     if next.ptr == NULL
  28.   E09:       if CAS(&tail.ptr->next, next, <node, next.count+1>)
  29.   E10:         break
  30.   E11:       endif
  31.   E12:     else
  32.   E13:       CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
  33.   E14:     endif
  34.   E15:   endif
  35.   E16: endloop
  36.   E17: CAS(&Q->Tail, tail, <node, tail.count+1>)
  37. */
  38. bool Enqueue(DATA data)
  39. {
  40.     SLOTID newid;
  41.     Node *newnode;
  42.     SLOTID oldtail;
  43.     Node *oldtailnode;
  44.     SLOTID nextid;
  45.     Node *nextnode;

  46.     /* E01 */
  47.     newid = GetSlot();
  48.     newnode = DecodeSlotID(newid);
  49.     if (newnode == NULL) return FALSE;
  50.     /* E02 */
  51.     newnode->data = data;
  52.     /* E03 */
  53.     newnode->next = NULL_ID;
  54.     /* E04 */
  55.     do {
  56.         /* E05 */
  57.         oldtail = tail;
  58.         oldtailnode = DecodeSlotID(oldtail);
  59.         /* E06 */
  60.         nextid = oldtailnode->next;
  61.         nextnode = DecodeSlotID(nextid);
  62.         /* E07 */
  63.         if (oldtail == tail)
  64.         {
  65.             /* E08 */
  66.             if (nextnode == NULL)
  67.             {
  68.                 /* E09 */
  69.                 if (SLOTID_CAS(&oldtailnode->next, next, newid))
  70.                 {
  71.                     /* E10 */
  72.                     break;
  73.                 /* E11 */
  74.                 }
  75.             }
  76.             /* E12 */
  77.             else
  78.             {
  79.                 /* E13 */
  80.                 SLOTID_CAS(&tail, oldtail, next);
  81.             /* E14 */
  82.             }
  83.         /* E15 */
  84.         }
  85.     /* E16 */
  86.     } while(1);
  87.     /* E17 */
  88.     SLOTID_CAS(&tail, oldtail, newid);

  89.     return TRUE;
  90. }

  91. /*
  92.   Origin Algorithm:
  93.   D01:  loop
  94.   D02:    head = Q->Head
  95.   D03:    tail = Q->Tail
  96.   D04:    next = head->next
  97.   D05:    if head == Q->Head
  98.   D06:      if head.ptr == tail.ptr
  99.   D07:        if next.ptr == NULL
  100.   D08:          return FALSE
  101.   D09:        endif
  102.   D10:        CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
  103.   D11:      else
  104.   D12:        *pvalue = next.ptr->value
  105.   D13:        if CAS(&Q->Head, head, <next.ptr, head.count+1>)
  106.   D14:          break
  107.   D15:        endif
  108.   D16:      endif
  109.   D17:    endif
  110.   D18:  endloop
  111.   D19:  free(head.ptr)
  112.   D20:  return TRUE
  113. */
  114. bool Dequeue(DATA *pdata)
  115. {
  116.     SLOTID oldhead
  117.     Node *oldheadnode;
  118.     SLOTID oldtail;
  119.     Node *oldtailnode;
  120.     SLOTID next;
  121.     Node *nextnode;

  122.     /* D01 */
  123.     do {
  124.         /* D02 */
  125.         oldhead = head;
  126.         oldheadnode = DecodeSlotID(oldhead);
  127.         /* D03 */
  128.         oldtail = tail;
  129.         oldtailnode = DecodeSlotID(oldtail);
  130.         /* D04 */
  131.         next = oldhead->next;
  132.         nextnode = DecodeSlotID(next);
  133.         /* D05 */
  134.         if (oldhead == head)
  135.         {
  136.             /* D06 */
  137.             if (headnode == tailnode)
  138.             {
  139.                 /* D07 */
  140.                 if (nextnode == NULL)
  141.                 {
  142.                     /* D08 */
  143.                     return FALSE;
  144.                 /* D09 */
  145.                 }
  146.                 /* D10 */
  147.                 SLOTID_CAS(&tail, oldtail, next);
  148.             }
  149.             /* D11 */
  150.             else
  151.             {
  152.                 /* D12 */
  153.                 *pdata = nextnode->data;
  154.                 /* D13 */
  155.                 if (SLOTID_CAS(&head, oldhead, next)
  156.                 {
  157.                     /* D14 */
  158.                     break;
  159.                 /* D15 */
  160.                 }
  161.             /* *D16 /
  162.             }
  163.         /* D17 */
  164.         }
  165.     /* D18 */
  166.     }
  167.     /* D19 */
  168.     PutSlot(oldhead);
  169.     /* D20 */
  170.     return TRUE;
  171. }
复制代码



In enqueue, E13 is the cooperative operation.

In dequeue, D10 is the cooperative operation.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Array Based Algorithm
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Valois also introduced a array based Non-Blocking FIFO algorithm[6], but it's
not feasible on real machine because it requires CAS on unaligned memory region.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Conclusion
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


This article introduces a link-listed based Non-Blocking algorithm for FIFO
queue. It's based on the "Fix-size Slots Management"[2], and feasible on the
platforms that support CAS.

==================== Footnote ====================
[1]  Maged M. Michael Michael L. Scott, Simple, Fast, and Practical Non-Blocking
&nbsp;    and Blocking Concurrent Queue Algorithms,, Department of Computer Science
&nbsp;    University of Rochester, 1995.
[2]  ZC Miao, Non-Blocking Algorithm - Fix-size Slots Management, revision
&nbsp;    2008-12-04
[3]  ABA problem from Wikipedia
&nbsp;    http&#58;//en.wikipedia.org/wiki/ABA_problem
[4]  Double Compare And Swap from Wikipedia
&nbsp;    http&#58;//en.wikipedia.org/wiki/Double_Compare_And_Swap
[5]  J.D. Valois, Lock-Free Data Structures PhD Thesis, Rensselaer Polytechnic
&nbsp;    Institute, Department of Computer Science, 1995.
[6]  J.D. Valois, Implementing Lock-Free Queues

论坛徽章:
0
4 [报告]
发表于 2008-12-05 09:03 |只看该作者
不错, 感谢分享
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP