Interlocked Singly Linked Lists (for Ben or anyone else). [Win32 Kernel]

Prev: Howto convert IPv6 from WTSClientAddress to string?
Next: Application Error, The memory could not be "read". How to debug?

From: Pops on 14 Nov 2007 08:35

Ben (C++ MVP) recommended using the Interlocked Singly Linked Lists API:

http://msdn2.microsoft.com/en-us/library/ms686962.aspx

as a more efficient synchronization method for a recent thread
discussion in regards to a FIFO queue design.

If Ben is reading or if anyone else is familiar with this API, I have a
few questions:

First, based on what I see, this is for LIFO designs (i.e., stacks)?
Not FIFO. Am I missing something? I only see Push/Pop functions, no
head/tail functions.

Seconds, can you tell how it works with IPC, can it work for IPC?

Thanks

--
HLS

From: Ben Voigt [C++ MVP] on 14 Nov 2007 09:26

From: Pops on 14 Nov 2007 10:41

Ben Voigt [C++ MVP] wrote:

> We were discussing a multiple writer, single consumer queue. For this the
> writers each use the InterlockedPushEntrySList function, while the consumer
> uses the InterlockedFlushSList function to retrieve the entire queue as an
> atomic operation. The consumer is then responsible for iterating through
> the list of items retrieved in FIFO order.

Gotcha!

>> Seconds, can you tell how it works with IPC, can it work for IPC?
>
> Because there is no name assigned, the SList cannot establish an IPC
> channel. However, because the caller has control over the placement of all
> data structures involved, it should be possible to use with shared memory.
> It would be highly desirable in such cases to obtain the same virtual
> address space in each of the processes. If this is not possible, then you
> probably can't use the Win32 API implementation, though the underlying
> algorithm will still work (using relative addresses / indices instead of
> pointers).

Ok. Right.

I was interested to see if it can be applied in our virtual I/O
communication/channel framework which currently uses RPC. Probably won't
apply for remote machines operations, but might prove useful as an
improved alternative to local RPC operations.

My whole mantra these days is to increase client/server performance and
scalability. So this api might just have some purpose for us.

--
HLS

From: Chris Thomasson on 14 Nov 2007 12:24

"Pops" <dude(a)nospamnospamnospam.com> wrote in message
news:%23tndYNsJIHA.4712(a)TK2MSFTNGP04.phx.gbl...
> Ben (C++ MVP) recommended using the Interlocked Singly Linked Lists API:
>
> http://msdn2.microsoft.com/en-us/library/ms686962.aspx
>
> as a more efficient synchronization method for a recent thread discussion
> in regards to a FIFO queue design.
>
> If Ben is reading or if anyone else is familiar with this API, I have a
> few questions:
>
> First, based on what I see, this is for LIFO designs (i.e., stacks)? Not
> FIFO. Am I missing something? I only see Push/Pop functions, no
> head/tail functions.
>
> Seconds, can you tell how it works with IPC, can it work for IPC?

Yes. You can create a node cache in shared memory and allocate/deallocate
your SLIST_ENTRY from there. Something like:

(quick and dirty pseudo-code sketch)
____________________________________________________________

#define L2CACHE_LINESZ() 128
#define CACHE_DEPTH() 1024

typedef struct shm_node_s shm_node;
typedef struct shm_node_data_s shm_node_data;
typedef struct shm_node_cache_s shm_node_cache;

struct shm_node_data_s {
/* some data */
};

struct shm_node_s {
SLIST_ENTRY slnode;
shm_node_data data;
};

struct shm_node_cache_s {
SLIST_HEADER anchor;
char pad[L2CACHE_LINESZ() - sizeof(SLIST_HEADER)];
shm_node nodes[CACHE_DEPTH()];
};

shm_node_cache*
shm_node_cache_create(
char const* const name
) {
int i;
shm_node_cache* const _this =
/* create file map with name, alloc and align shm */

InitializeSListHead(&_this->anchor);

for(i = 0; i < CACHE_DEPTH(); ++i) {
InterlockedPushEntrySList(&_this->anchor,
&_this->nodes[i].slnode);
}

return _this;
}

shm_node*
shm_node_cache_pop(
shm_node_cache* const _this
) {
return (shm_node*)InterlockedPopEntrySList(&_this->anchor);
}

void
shm_node_cache_push(
shm_node_cache* const _this,
shm_node* const node
) {
InterlockedPushEntrySList(&_this->anchor, &node->slnode);
}

Now you can create your queue. Something like:

typedef struct shm_queue_s shm_queue;

struct shm_queue_s {
SLIST_HEADER anchor;
char pad[L2CACHE_LINESZ() - sizeof(SLIST_HEADER)];
shm_node_cache* cache;
};

shm_queue*
shm_queue_create(
shm_node_cache* const cache,
char const* const name
) {
shm_queue* const _this =
/* create file map with name, alloc and align shm */
InitializeSListHead(&_this->anchor);
_this->cache = cache;
return _this;
}

int
shm_queue_process(
shm_queue* const _this,
void (*fp_process) (shm_node_data*)
) {
int i = 0;
shm_node* node = (shm_node*)InterlockedFlushSList(&_this->anchor);

/* reverse order of node list to create FIFO */

while (node ) {
shm_node* const nx = (shm_node*)nx->slnode.Next;
fp_process(&node->data);
shm_node_cache_push(_this->cache, node);
++i;
node = nx;
}
return i;
}

BOOL
shm_queue_push(
shm_queue* const _this,
shm_node_data* const data
) {
shm_node* const node = shm_node_cache_pop(_this->cache);
if (! node) { return FALSE; }
node->data = *data;
InterlockedPushEntrySList(&_this->anchor, &node->slnode);
return TRUE;
}

____________________________________________________________

Also, for what its worth, here is how Microsoft implements the SList API
under 32-bit systems:

typedef struct slist_anchor_s slist_anchor;
typedef struct slist_node_s slist_node;

struct slist_node_s {
slist_node* nx;
};

struct slist_anchor_s {
slist_node* head;
/* monotonic counter to avoid the infamous aba problem */
__int32 aba;
};

void
slist_anchor_init(
slist_anchor* const _this
) {
_this->head = 0;
_this->aba = 0;
}

void
slist_anchor_push(
slist_anchor* const _this,
slist_node* node
) {
slist_node* cmp;
do {
cmp = _this->head;
node->nx = cmp;
} while (! CAS(&_this->anchor, cmp, node));
}

slist_node*
slist_anchor_pop(
slist_anchor* const _this
) {
slist_anchor cmp, xchg;
do {
cmp = *_this;
if (! cmp.head) { return 0; }

/* hazard! refer to PDR algorihtm for more details... */
xchg.head = cmp.head->nx;

/* increment ABA */
xchg.aba = cmp.aba + 1;

} while (! DWCAS(_this, &cmp, &xchg));

return cmp.head;
}

Any thoughts?

From: Chris Thomasson on 14 Nov 2007 12:27

| Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10
Prev: Howto convert IPv6 from WTSClientAddress to string?
Next: Application Error, The memory could not be "read". How to debug?