Hi All
I have a virtualised (VMware) RDS 2012 R2 environment with 20 Session Hosts spread across 6 Dell ESXi hosts (2 sets of different PowerEdge models). Over the past 4-6 weeks we have started to get multiple event 7011s followed by a 7046:
A timeout (30000 milliseconds) was reached while waiting for a transaction response from the UmRdpService service.
The following service has repeatedly stopped responding to service control requests: Remote Desktop Services UserMode Port Redirector
At this point some existing connected users can't sign out and applications start to crash, including explorer.exe. Trying to shut down via the GUI just hangs, and the only way to get the server back is to reset the power using the vSphere console.
Applications on the Session Hosts are mainly MS Office 2016, Acrobat Reader, 7-Zip and Webroot AV. The Windows OS and applications are fully patched, and Dell firmware and drivers are up to date.
Users connect in via RemoteApp, and local drives and printers are redirected into their sessions.
The weird thing is that, like clockwork, the crashes happen at the end of each day, usually between 16:00 and 18:00. To me it looks like a degradation symptom, or perhaps it's triggered by users disconnecting or logging off their sessions. It's affecting a couple of servers each day.
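To check the time-of-day pattern properly rather than by eye, the 7011/7046 event times could be bucketed by hour. This is only a rough Python sketch: the `crashes_per_hour` helper is hypothetical, and the timestamps below are made-up placeholders, not values from my event logs.

```python
from collections import Counter
from datetime import datetime

def crashes_per_hour(timestamps):
    """Count events per hour of day from 'YYYY-MM-DD HH:MM:SS' strings."""
    hours = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps
    )
    # Sort by hour so the end-of-day cluster is easy to spot.
    return dict(sorted(hours.items()))

# Placeholder timestamps for illustration only (not real log data).
sample = ["2018-01-01 16:05:00", "2018-01-01 17:30:00", "2018-01-02 16:45:00"]
for hour, count in crashes_per_hour(sample).items():
    print(f"{hour:02d}:00  {count}")
```

If the real event times cluster as tightly as I think they do, that would point more towards logoff/disconnect activity than random hardware faults.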
On top of this, it appears the 7011/7046 sequence results in a BSOD. I have grabbed the Memory.dmp file and opened it with WinDbg.
I'm now trying to figure out the dump, which I have uploaded to Pastebin here (happy to paste the dump here, but I didn't want to "dump" too much information in the post).
What stands out to me is rdbss.sys:
Probably caused by : rdbss.sys ( rdbss!__RxAcquireFcb+1f3 )
IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff80179d3ba44, address which referenced memory
BUCKET_ID: AV_rdbss!__RxAcquireFcb
PRIMARY_PROBLEM_CLASS: AV_rdbss!__RxAcquireFcb
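For anyone reading along, the four arguments can be decoded using the bit meanings printed in the WinDbg output above. This is a minimal Python sketch; the `decode_bugcheck_a` helper is my own illustration and not part of any debugger tooling.

```python
def decode_bugcheck_a(arg1, arg2, arg3, arg4):
    """Decode the four IRQL_NOT_LESS_OR_EQUAL (0xA) bug check arguments."""
    return {
        "memory_referenced": hex(arg1),
        "null_pointer": arg1 == 0,                        # address 0 was referenced
        "irql": arg2,                                     # IRQL at the time of the fault
        "operation": "write" if arg3 & 0x1 else "read",   # bit 0
        "execute": bool(arg3 & 0x8),                      # bit 3
        "faulting_instruction": hex(arg4),
    }

# Values copied from the Arguments section of the dump above.
info = decode_bugcheck_a(0x0, 0x2, 0x0, 0xfffff80179d3ba44)
for key, value in info.items():
    print(f"{key}: {value}")
```

As far as I can tell, this reads as a read from address 0 (a null pointer) at IRQL 2, with the instruction at Arg4 falling inside rdbss!__RxAcquireFcb.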
My rdbss.sys version is 6.3.9600.18895.
Can anyone help decipher the above and suggest the next/best course of action?
Many thanks :)