Discussion:
[rabbitmq-users] 3.5.4 rc1 flow control issue with .Net (nonpersistent messages, pub no sub)
Raymond Rizzuto
2015-07-13 18:31:12 UTC
Permalink
After publishing ~1.2 million non-persistent messages to a topic with 1
queue bound, and no consumers, I saw the throughput drop to 0, and a little
while later the publisher logged 4 instances of the following exception.
Note that both the server and client code are using 3.5.4 rc1.

The exception isn't my main concern. My concern is that the throughput
went from 10k/second to 0, stayed there for 4 minutes, went up to ~2k/sec,
stayed there for ~2 minutes, then recovered back to 10k/sec. Preliminary
results show it repeating the cycle. I will try with persistent messages
next.


Caught IOException: System.IO.IOException: Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
   at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.BufferedStream.Flush()
   at System.IO.BinaryWriter.Flush()
   at RabbitMQ.Client.Impl.SocketFrameHandler.WriteFrameSet(IList`1 frames) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\SocketFrameHandler.cs:line 241
   at RabbitMQ.Client.Impl.Command.TransmitAsFrameSet(Int32 channelNumber, Connection connection) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\Command.cs:line 218
   at RabbitMQ.Client.Impl.Command.Transmit(Int32 channelNumber, Connection connection) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\Command.cs:line 159
   at RabbitMQ.Client.Impl.SessionBase.Transmit(Command cmd) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\SessionBase.cs:line 0
   at RabbitMQ.Client.Impl.ModelBase.ModelSend(MethodBase method, ContentHeaderBase header, Byte[] body) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\ModelBase.cs:line 436
   at RabbitMQ.Client.Framing.Impl.Model._Private_BasicPublish(String exchange, String routingKey, Boolean mandatory, Boolean immediate, IBasicProperties basicProperties, Byte[] body) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\gensrc\RabbitMQ.Client\autogenerated-api-0-9-1.cs:line 3869
   at RabbitMQ.Client.Impl.ModelBase.BasicPublish(String exchange, String routingKey, Boolean mandatory, Boolean immediate, IBasicProperties basicProperties, Byte[] body) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\ModelBase.cs:line 1246
   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicPublish(String exchange, String routingKey, IBasicProperties basicProperties, Byte[] body) in c:\cygwin64\var\tmp\rabbit-build.13152\rabbitmq-public-umbrella\rabbitmq-dotnet-client\projects\client\RabbitMQ.Client\src\client\impl\AutorecoveringModel.cs:line 827
   at rabbitPublisher.rabbitPublisher.Main(String[] args) in c:\Users\rizzuto\Documents\Visual Studio 2012\Projects\queuing\rabbitPublisher\rabbitPublisher.cs:line 69
--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+***@googlegroups.com.
To post to this group, send an email to rabbitmq-***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Michael Klishin
2015-07-13 18:37:47 UTC
Permalink
Have you run into a memory alarm?

MK
Alvaro Videla
2015-07-13 19:05:15 UTC
Permalink
Have you tried this test with versions older than 3.5.4 rc1? That is, is this a
regression, or is this the first time you have run this test?
Raymond Rizzuto
2015-07-14 12:41:46 UTC
Permalink
I had been unsuccessful in prior releases because the connection would
break when the .Net publisher was placed in flow control. This release
candidate fixes that issue. I am retesting now, and seeing other potential
issues.
Michael Klishin
2015-07-14 12:43:35 UTC
Permalink
Post by Raymond Rizzuto
I had been unsuccessful in prior releases because the connection
would break when the .Net publisher was placed in flow control.
This release candidate fixes that issue. I am retesting now,
and seeing other potential issues.
I’m going to produce another RC in a bit.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Raymond Rizzuto
2015-07-13 19:06:58 UTC
Permalink
How would I know if there was a memory alarm? I checked the server log and
see nothing indicating a memory alarm.

I looked at the web UI, and it lists a 6.4G high water mark, and my system
has 16G.
Michael Klishin
2015-07-13 19:20:38 UTC
Permalink
There would be visible messages in the log.

MK
Raymond Rizzuto
2015-07-13 20:47:02 UTC
Permalink
There are none.
Michael Klishin
2015-07-13 21:17:46 UTC
Permalink
Post by Raymond Rizzuto
I looked at the web UI, and it lists a 6.4G high water mark, and
my system has 16G.
And how much RAM is actually used?

Since you say you had messages routed to a queue without consumers, it can be that the amount has crossed vm_memory_high_watermark_paging_ratio [1]
and RabbitMQ had to move a lot of data to disk. With a few gigs it could take
a few minutes on spinning disks. 

1. https://github.com/rabbitmq/rabbitmq-server/blob/master/docs/rabbitmq.config.example
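
As a rough sketch of where paging would begin on the machine described above: it assumes a vm_memory_high_watermark of 0.4 (which matches the 6.4G the web UI reports for 16G of RAM) and a vm_memory_high_watermark_paging_ratio of 0.5. Both values are assumptions to check against the node's actual configuration, and the function name is made up for illustration.

```python
def paging_threshold_gb(total_ram_gb, high_watermark=0.4, paging_ratio=0.5):
    """Return (watermark_gb, paging_start_gb) for a node.

    high_watermark and paging_ratio mirror the vm_memory_high_watermark and
    vm_memory_high_watermark_paging_ratio settings. The defaults used here
    are assumptions, not values read from a running broker.
    """
    watermark_gb = total_ram_gb * high_watermark
    paging_start_gb = watermark_gb * paging_ratio
    return watermark_gb, paging_start_gb


if __name__ == "__main__":
    watermark, paging_start = paging_threshold_gb(16)
    print(f"memory alarm at ~{watermark:.1f}G, paging starts at ~{paging_start:.1f}G")
```

Under those assumptions paging would start at roughly half of the 6.4G watermark, well before a memory alarm is raised, which would be consistent with no alarm appearing in the log.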
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Raymond Rizzuto
2015-07-14 14:01:56 UTC
Permalink
My PC had 10G of free RAM before the test, so I do not think I am hitting a
memory issue.

I suspect, as you say, that I am hitting the point where the
nonpersistent messages need to be offloaded from memory, and during that
time performance would reasonably be impaired.
The times seem long to me, especially since I have the db configured to use
a separate SSD dedicated to the RabbitMQ persistence. I did a test, and I
can copy a 1 gig directory from SSD to SSD in 20 seconds, so memory to SSD
would likely be that or better. Yet I saw 4+ minutes with 0 or low
throughput.

I'll try persistent next. If I understand correctly, they get persisted as
they are received, so there shouldn't be any big pause to persist 1
million messages due to memory pressure.
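
A back-of-the-envelope version of that estimate, as a sketch only: it assumes about 1.2 million messages of roughly 1K each (payload only, ignoring per-message overhead and index writes) and the SSD copy rate measured above, 1 gig in 20 seconds. The function name is hypothetical.

```python
def estimated_page_out_seconds(message_count, message_bytes, disk_bytes_per_second):
    """Estimate how long writing the message payloads to disk would take."""
    total_bytes = message_count * message_bytes
    return total_bytes / disk_bytes_per_second


if __name__ == "__main__":
    GIB = 1024 ** 3
    ssd_rate = GIB / 20  # measured: 1 gig copied in 20 seconds
    seconds = estimated_page_out_seconds(1_200_000, 1024, ssd_rate)
    print(f"~{seconds:.0f} s to write the payloads at the measured SSD rate")
```

With these inputs the estimate comes out at about 23 seconds, far short of the 4+ minutes observed, so raw sequential disk bandwidth alone does not seem to explain the pause.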
Michael Klishin
2015-07-14 14:54:39 UTC
Permalink
If I understand correctly, they get persisted as they are received,
so there shouldn't be any big pause to persist a 1 million messages
due to memory pressure.
Messages can be kept in RAM even after they are moved to disk. 
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Michael Klishin
2015-07-14 14:59:18 UTC
Permalink
Post by Michael Klishin
Messages can be kept in RAM even after they are moved to disk.
When you observe throughput drop, please run rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'
every 10-15 seconds, and monitor disk I/O activity. 
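
One way to run that command on a timer is sketched below. It passes the Erlang expression as a single argument, so no shell quoting is involved. The function names, the iteration count, and the injectable `run`/`sleep` parameters are illustrative assumptions; on Windows the executable name may need to be `rabbitmqctl.bat`.

```python
import subprocess
import time


def build_command(rabbitmqctl="rabbitmqctl"):
    """Argument list for the diagnostic call; the expression is one argument."""
    return [rabbitmqctl, "eval", "rabbit_diagnostics:maybe_stuck()."]


def poll(iterations, interval_seconds=10, run=subprocess.run, sleep=time.sleep):
    """Yield the result of each diagnostic run, pausing between runs.

    `run` and `sleep` are injectable so the loop can be exercised
    without a broker present.
    """
    for i in range(iterations):
        yield run(build_command(), capture_output=True, text=True)
        if i < iterations - 1:
            sleep(interval_seconds)


# Example use against a live node (not executed here):
#   for result in poll(iterations=6, interval_seconds=10):
#       print(time.strftime("%H:%M:%S"), result.stdout)
```

Printing a timestamp with each result makes it easier to line the output up with disk I/O activity observed at the same time.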
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Raymond Rizzuto
2015-07-14 15:29:08 UTC
Permalink
OK, I will retest the nonpersistent message case and see what that
information provides.
Raymond Rizzuto
2015-07-14 15:59:58 UTC
Permalink
I'm attaching an image from the rabbitmq web ui which seems to show that
messages stopped, but disk io only had a brief burst.

I'm having issues with trying to use the rabbitmqctl command:

C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.90\sbin>rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'
The system cannot find the path specified.


I have verified that rabbitmqctl.bat is in the current directory. I
noticed the same issue there as well with this line:

call "%cd%\rabbitmq-env.bat"


I tried changing %cd% to %TDP0%, but I get the same error.
Raymond Rizzuto
2015-07-14 16:31:30 UTC
Permalink
My bad - I had missed setting ERLANG_HOME. Please do update the %cd% to
%TDP0% in rabbitmqctl.bat, however.
Raymond Rizzuto
2015-07-14 16:35:55 UTC
Permalink
I'm not sure if this is a Windows vs. Linux thing, but I now get a different
error message:

C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.90\sbin>rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'
Error: could not recognise command
Usage:
rabbitmqctl [-n <node>] [-t <timeout>] [-q] <command> [<command options>]
.
.
.
Raymond Rizzuto
2015-07-14 19:57:52 UTC
Permalink
I am seeing similar results with nonpersistent messages. After
publishing 10K msgs/sec (size=1K) for less than 3 minutes, the following
happens:

- performance drops dramatically
- there is a disk write spike according to the web UI
- the .NET side catches a System.IO.IOException

I also see the following in the server log:

=INFO REPORT==== 14-Jul-2015::15:45:57 ===
accepting AMQP connection <0.322.0> ([::1]:57281 -> [::1]:5672)

=INFO REPORT==== 14-Jul-2015::15:50:35 ===
accepting AMQP connection <0.474.0> ([::1]:57331 -> [::1]:5672)

=WARNING REPORT==== 14-Jul-2015::15:50:38 ===
closing AMQP connection <0.322.0> ([::1]:57281 -> [::1]:5672):
connection_closed_abruptly

Tomorrow I will try to get more instrumentation on this, possibly a packet
capture as well as more detailed logging with timestamps on the .NET side.
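
The timestamps in the server log excerpt above can be turned into intervals with a few lines of code. This is a minimal sketch: the function names are made up, and it assumes the report header format shown in the excerpt and an English locale for the month abbreviation.

```python
from datetime import datetime

# Matches stamps such as 14-Jul-2015::15:45:57 (month name is locale-dependent).
LOG_TIME_FORMAT = "%d-%b-%Y::%H:%M:%S"


def parse_report_time(header):
    """Extract the timestamp from a '=INFO REPORT==== <stamp> ===' header line."""
    stamp = header.split("====")[1].strip().split()[0]
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


def gaps_in_seconds(headers):
    """Seconds elapsed between each consecutive pair of report headers."""
    times = [parse_report_time(h) for h in headers]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    headers = [
        "=INFO REPORT==== 14-Jul-2015::15:45:57 ===",
        "=INFO REPORT==== 14-Jul-2015::15:50:35 ===",
        "=WARNING REPORT==== 14-Jul-2015::15:50:38 ===",
    ]
    print(gaps_in_seconds(headers))  # [278.0, 3.0]
```

For the three reports above, the second connection was accepted 278 seconds after the first, and the first was logged as closed abruptly 3 seconds after that.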
Michael Klishin
2015-07-14 20:00:00 UTC
Permalink
Post by Raymond Rizzuto
performance drops dramatically
there is a disk write spike according to the web ui
the .net side catches a system.io.ioexception
=INFO REPORT==== 14-Jul-2015::15:45:57 ===
accepting AMQP connection <0.322.0> ([::1]:57281 -> [::1]:5672)
=INFO REPORT==== 14-Jul-2015::15:50:35 ===
accepting AMQP connection <0.474.0> ([::1]:57331 -> [::1]:5672)
=WARNING REPORT==== 14-Jul-2015::15:50:38 ===
connection_closed_abruptly
This suggests to me that messages are being paged out. While queue processes
are busy moving data to disk, they stop accepting incoming messages; flow
control then kicks in, and writes in the client eventually time out and result
in an exception.
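
On the publishing side, one generic way to cope with a write that times out is to back off and retry. The sketch below is illustrative only and is not how the .NET client behaves: `publish` is a hypothetical callable assumed to re-establish its connection as needed, and retrying may produce duplicate messages if the original write partially succeeded.

```python
import time


def publish_with_retry(publish, message, max_attempts=5, base_delay=1.0,
                       sleep=time.sleep):
    """Call publish(message); on an I/O error, wait and retry with backoff.

    Returns the number of attempts used. Re-raises the last error if every
    attempt fails. Delays double each time: base_delay, 2x, 4x, ...
    """
    for attempt in range(1, max_attempts + 1):
        try:
            publish(message)
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.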
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Michael Klishin
2015-07-15 15:25:21 UTC
Permalink
Also, the connection is re-established, but more than a minute
=INFO REPORT==== 15-Jul-2015::10:40:55 ===
accepting AMQP connection <0.1483.0> ([::1]:55758 -> [::1]:5672)
I think this may account for why my data rate drops.
OK, this is pretty curious. Thanks. 
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Raymond Rizzuto
2015-07-16 18:46:09 UTC
Permalink
I re-read http://www.rabbitmq.com/memory.html, and if I understand
correctly, the paging to disk should happen before flow control is
initiated. In my tests it seemed like the flow control was initiated
first. Is there any way to determine when these events occurred? Log
files, etc.?
Michael Klishin
2015-07-16 20:44:54 UTC
Permalink
Post by Raymond Rizzuto
I re-read http://www.rabbitmq.com/memory.html, and if I
understand correctly, the paging to disk should happen before
flow control is initiated. In my tests it seemed like the flow
control was initiated first. Is there any way to determine when
these events occurred? Log files, etc.?
Paging to disk (typically, with default settings) happens before *resource-driven* alarms.

If a queue is busy moving data to disk and does not acknowledge processing of Erlang messages
in its mailbox to the channel(s) that send them, internal flow control will kick in. It is
typically temporary (can go in and out multiple times a second) but with a lot of I/O
to perform, it can take longer.
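
The mechanism described above can be illustrated with a toy model of credit-based flow control between one channel and one queue. This is a sketch for intuition only, not RabbitMQ's implementation; the class name and the credit values are made up.

```python
class CreditFlowLink:
    """Toy model of credit-based flow control between a channel and a queue."""

    def __init__(self, initial_credit=4, grant_every=2):
        self.credit = initial_credit     # sends the channel may still make
        self.grant_every = grant_every   # queue grants credit in batches of this size
        self.mailbox = 0                 # messages waiting in the queue's mailbox
        self.processed_since_grant = 0

    @property
    def blocked(self):
        """True while the channel has no credit left (it is 'in flow')."""
        return self.credit == 0

    def send(self):
        """Channel sends one message; returns False if it is blocked."""
        if self.blocked:
            return False
        self.credit -= 1
        self.mailbox += 1
        return True

    def queue_process(self, count):
        """Queue handles up to `count` messages, granting credit back in batches."""
        for _ in range(min(count, self.mailbox)):
            self.mailbox -= 1
            self.processed_since_grant += 1
            if self.processed_since_grant == self.grant_every:
                self.credit += self.grant_every
                self.processed_since_grant = 0


if __name__ == "__main__":
    link = CreditFlowLink()
    while link.send():
        pass                      # publish until credit runs out
    link.queue_process(0)         # queue is busy with disk I/O, handles nothing
    print("blocked while queue is busy:", link.blocked)
    link.queue_process(2)         # queue catches up and grants credit
    print("blocked after queue catches up:", link.blocked)
```

As long as the queue keeps processing its mailbox, credit flows back quickly and the channel is blocked only briefly; when the queue stalls, the channel stays blocked until the queue resumes.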
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Raymond Rizzuto
2015-07-17 14:17:33 UTC
Permalink
I updated to 3.5.4 rc2, and the batch files are better. However, I still
can't run the command you recommended:

C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'
Error: syntax error before:

I also tried this example from the manual, with similar results.

C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval 'node().'
Michael Klishin
2015-07-17 14:18:43 UTC
Permalink
I updated to 3.5.4 rc2, and the batch files are better. However,
C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval 'rabbit_diagnostics:maybe_stuck().'
We will take a look, thanks. 
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
'Robert Raschke' via rabbitmq-users
2015-07-17 17:17:28 UTC
Permalink
I think you need to use double, not single, quotes around the Erlang
command to be eval'ed.
Raymond Rizzuto
2015-07-17 17:34:33 UTC
Permalink
I retried both commands I was having trouble with, using double quotes instead of
single quotes, and got a bit further:

C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval "node()."
***@CYAN199B

C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval "rabbit_diagnostic:maybe_stuck()."
Error: {undef,[{rabbit_diagnostic,maybe_stuck,[],[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,657}]},
{rpc,'-handle_call_call/6-fun-0-',5,
[{file,"rpc.erl"},{line,205}]}]}


The first command works, the second command is still saying "Error:", but
getting further.

It might be worth indicating that double quotes need to be used, at least
on Windows.
Alvaro Videla
2015-07-17 18:20:42 UTC
Permalink
The module is called rabbit_diagnostics, with an 's' at the end; that's why
you get that error.

undef means the function you are trying to call is undefined.
Post by Raymond Rizzuto
II retried both commands I was having with double quotes instead of single
C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval "node()."
C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3.91\sbin>
rabbitmqctl eval "rabbit_diagnostic:maybe_stuck()."
Error: {undef,[{rabbit_diagnostic,maybe_stuck,[],[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,657}]},
{rpc,'-handle_call_call/6-fun-0-',5,
[{file,"rpc.erl"},{line,205}]}]}
The first command works, the second command is still saying "Error:", but
getting further.
It might be worth indicating that double quotes need to be used, at least
on Windows.
Raymond Rizzuto
2015-07-17 18:24:49 UTC
Permalink
Good eyes! Thanks, that works, and this is what I got:

There are 206 processes.
Investigated 1 processes this round, 5000ms to go.
Investigated 1 processes this round, 4500ms to go.
Investigated 1 processes this round, 4000ms to go.
Investigated 1 processes this round, 3500ms to go.
Investigated 1 processes this round, 3000ms to go.
Investigated 1 processes this round, 2500ms to go.
Investigated 1 processes this round, 2000ms to go.
Investigated 1 processes this round, 1500ms to go.
Investigated 1 processes this round, 1000ms to go.
Investigated 1 processes this round, 500ms to go.
Found 1 suspicious processes.
[{pid,<5360.30.0>},
 {registered_name,user},
 {current_stacktrace,[{user,get_chars,8,[{file,"user.erl"},{line,612}]},
                      {user,do_io_request,5,[{file,"user.erl"},{line,182}]},
                      {user,server_loop,2,[{file,"user.erl"},{line,132}]},
                      {user,catch_loop,3,[{file,"user.erl"},{line,99}]}]},
 {initial_call,{erlang,apply,2}},
 {dictionary,[{encoding,latin1},{read_mode,list},{shell,<5360.31.0>}]},
 {message_queue_len,0},
 {links,[<5360.28.0>,<5360.31.0>,#Port<5360.384>,<5360.6.0>]},
 {monitors,[]},
 {monitored_by,[<5360.48.0>]},
 {heap_size,233}]
ok

Does that mean anything to you?
Raymond Rizzuto
2015-07-17 20:58:10 UTC
Permalink
FWIW, I see a similar mention of the issue I am seeing
at http://stackoverflow.com/questions/21666537/rabbitmq-memory-control-queue-is-full-and-is-not-paging-connection-hangs.
Interestingly, that is on Linux, and a year ago.
Alvaro Videla
2015-07-18 22:24:23 UTC
Permalink
That user process from the maybe_stuck call seems to be doing I/O.
Alvaro Videla
2015-07-20 11:07:50 UTC
Permalink
Considering you are publishing small messages, you might want to check the
settings of queue_index_embed_msgs_below explained here:
https://www.rabbitmq.com/persistence-conf.html and see how that affects
performance.

If you set queue_index_embed_msgs_below to 0, then small messages won't be
stored in the queue index at all.
Raymond Rizzuto
2015-07-20 23:44:51 UTC
Permalink
It isn't a question of performance per se. Since I have only 1 queue bound
to the exchange, each message would be written only once regardless of
whether it is embedded or not.

The issue I am having is falling-off-a-cliff bad. I do not get a memory
alarm, I just see the publisher go into flow control after several
minutes of flawless performance. I realize that there may be a slowdown
due to persisting messages to disk when the persist level is hit, but I am
not seeing disk writes to back up that theory.

I plan to get back to this later this week and try to capture as much data
from the broker, publisher, stuck processes, etc. to hopefully provide some
insight into possible causes.

Since we sometimes take down consumers for 30 minutes or more in order to
do an update live, it is a key requirement that the message queue continue
to queue messages in the interim.
Alvaro Videla
2015-07-21 15:10:13 UTC
Permalink
Hi,

I ran some tests today and found the following. There seem to be quite a
few things in play:

1) Publishing messages whose size is below the configured
queue_index_embed_msgs_below will make those messages be embedded in the
queue index. The default value here is 4096 bytes. I ran my benchmarks with
1000-byte messages.
2) The queue index also has a configuration parameter called
queue_index_max_journal_entries, which directly affects RAM usage based on
the queue_index_embed_msgs_below setting, as explained here:
https://www.rabbitmq.com/persistence-conf.html#index-embedding so there are
advantages and disadvantages to using queue_index_embed_msgs_below.
3) When the queue index journal holds more than
queue_index_max_journal_entries entries, the journal is flushed to disk;
whenever this happens, publishing performance drops significantly. I've seen
my broker go from 32077 1 KB msgs/sec being published down to 1055 msgs/sec.
I think this is the problem you are seeing.
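
To make the interaction between the two settings concrete, here is a rough Python toy model of the three points above. It is not RabbitMQ's implementation: it assumes one journal entry per published message, ignores per-entry overhead, and uses a made-up journal limit (ASSUMED_MAX_JOURNAL_ENTRIES), since the actual default of queue_index_max_journal_entries is not stated in this thread. The function names are illustrative only.

```python
# Toy model only; names and the journal limit below are illustrative assumptions.
EMBED_BELOW_DEFAULT = 4096            # bytes, the default quoted in the post above
ASSUMED_MAX_JOURNAL_ENTRIES = 65536   # hypothetical value, not confirmed in this thread


def journal_ram_bytes(entries, msg_size, embed_below=EMBED_BELOW_DEFAULT):
    """Estimate message-body bytes held in RAM by `entries` journal entries.

    A body only counts if it is small enough to be embedded in the
    queue index (size strictly below `embed_below`).
    """
    embedded = msg_size if msg_size < embed_below else 0
    return entries * embedded


def count_flushes(published, max_journal_entries):
    """Count journal flushes, assuming one journal entry per published message."""
    flushes = 0
    in_journal = 0
    for _ in range(published):
        in_journal += 1
        if in_journal > max_journal_entries:  # journal is over its limit: flush it
            flushes += 1
            in_journal = 0
    return flushes


if __name__ == "__main__":
    limit = ASSUMED_MAX_JOURNAL_ENTRIES
    print("embedded bytes in a full journal:", journal_ram_bytes(limit, 1000))
    print("with embedding disabled:", journal_ram_bytes(limit, 1000, embed_below=0))
    print("flushes for 1.2M messages:", count_flushes(1_200_000, limit))
```

With 1000-byte messages and this assumed limit, a full journal would hold roughly 65 MB of embedded bodies in the model, while an embed threshold of 0 drops that to zero. That is the trade-off point 2) refers to.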

Whether messages are persistent or not, the queue index must be kept
somewhere (initially in RAM), and when the RAM usage grows too much it has
to be paged out to disk, as expected.

Perhaps we could improve performance there somehow, I don't know yet. I
wanted to confirm that I'm seeing a similar issue (if not the same).

Regards,

Alvaro
Raymond Rizzuto
2015-07-21 15:24:41 UTC
Permalink
In my testing, I tried both 10K/sec and 5K/sec with a size of 1024, and saw
the same behavior with the throughput dropping to 0 for several minutes. I
am running this on a Windows 7 workstation with 16GB of RAM, and a separate
SSD used only for the mnesia db.
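
For a sense of scale, here is a back-of-the-envelope Python calculation of the backlog that builds up while consumers are down, using figures from this thread (10K or 5K msgs/sec, 1024-byte messages, a 30-minute consumer outage). It counts message bodies only and ignores properties and any per-message broker overhead, so real usage would be higher; the function name is illustrative.

```python
def backlog_bytes(rate_per_sec, outage_minutes, msg_size_bytes):
    """Bytes of message bodies queued while no consumer is draining the queue."""
    messages = rate_per_sec * outage_minutes * 60
    return messages * msg_size_bytes


if __name__ == "__main__":
    for rate in (10_000, 5_000):
        total = backlog_bytes(rate, 30, 1024)
        print(f"{rate} msgs/sec for 30 min: {total / 2**30:.1f} GiB of bodies")
```

At 10K msgs/sec the bodies alone come to about 17 GiB, more than the 16GB of RAM on this workstation, so a backlog of that length cannot stay entirely in memory and has to reach disk at some point.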

I looked at the link you mention, and in rabbitmq.config.example, but I do
not see anything for queue_index_max_journal_entries. What is the default
value? It would be great if there was some logging I could turn on to see
when persistence is kicking in.
Post by Alvaro Videla
Hi,
I ran some tests today and found the following. There seem to be quite a
1) Publishing messages whose size is below the
configured queue_index_embed_msgs_below will make those messages be
embedded in the queue index. The default value here is 4096 bytes. I ran my
benchmarks with 1000-byte messages.
2) The queue index also has a configuration parameter
called queue_index_max_journal_entries which directly affects RAM usage
https://www.rabbitmq.com/persistence-conf.html#index-embedding
<https://www.google.com/url?q=https%3A%2F%2Fwww.rabbitmq.com%2Fpersistence-conf.html%23index-embedding&sa=D&sntz=1&usg=AFQjCNGp6p_opqpraRZ-xbpoKwlMgov_VA>
so there are advantages and disadvantages to using
queue_index_embed_msgs_below.
3) When the queue index journal holds more than queue_index_max_journal_entries
then the journal will be flushed to disk; whenever this happens publishing
performance drops significantly. I've seen my broker go from 32077 1kb
msgs/sec being published down to 1055 msgs/sec. I think this is the problem
you are seeing.
Whether messages are persistent or not, the queue index must be kept
somewhere (initially in RAM), and when the RAM usage grows too much it has
to be paged out to disk, as expected.
Perhaps we could improve performance there somehow, I don't know yet. I
wanted to confirm that I'm seeing a similar issue (if not the same).
Regards,
Alvaro
Post by Raymond Rizzuto
It isn't a question of performance per se. Since I have only 1 queue
bound to the exchange, each message would be written only once regardless
of whether it is embedded or not.
The issue I am having is falling-off-the-cliff bad. I do not get a
memory alarm, I just see the publisher go into flow control after
several minutes of flawless performance. I realize that there may be a
slowdown due to persisting messages to disk when the persist level is hit,
but I am not seeing disk writes to back up that theory.
I plan to get back to this later this week and try to capture as much
data from the broker, publisher, stuck processes, etc. to hopefully provide
some insight into possible causes.
Since we sometimes take down consumers for 30 minutes or more in order to
do an update live, it is a key requirement that the message queue continue
to queue messages in the interim.
Post by Alvaro Videla
Considering you are publishing small messages, you might want to check
https://www.rabbitmq.com/persistence-conf.html and see how that affects
performance.
If you set queue_index_embed_msgs_below to 0, then small messages won't
be stored in the queue index at all.
That user process from the maybe stuck call seems to be doing I/O
Post by Alvaro Videla
FWIW, I see a similar mention of the issue I am seeing at
http://stackoverflow.com/questions/21666537/rabbitmq-memory-control-queue-is-full-and-is-not-paging-connection-hangs.
Interestingly, that is on Linux, and a year ago.
Post by Raymond Rizzuto
There are 206 processes.
Investigated 1 processes this round, 5000ms to go.
Investigated 1 processes this round, 4500ms to go.
Investigated 1 processes this round, 4000ms to go.
Investigated 1 processes this round, 3500ms to go.
Investigated 1 processes this round, 3000ms to go.
Investigated 1 processes this round, 2500ms to go.
Investigated 1 processes this round, 2000ms to go.
Investigated 1 processes this round, 1500ms to go.
Investigated 1 processes this round, 1000ms to go.
Investigated 1 processes this round, 500ms to go.
Found 1 suspicious processes.
[{pid,<5360.30.0>},
{registered_name,user},
{current_stacktrace,[{user,get_chars,8,[{file,"user.erl"},{line,612}]},
{user,do_io_request,5,[{file,"user.erl"},{line,182}]},
{user,server_loop,2,[{file,"user.erl"},{line,132}]},
{user,catch_loop,3,[{file,"user.erl"},{line,99}]}]},
{initial_call,{erlang,apply,2}},
{dictionary,[{encoding,latin1},{read_mode,list},{shell,<5360.31.0>}]},
{message_queue_len,0},
{links,[<5360.28.0>,<5360.31.0>,#Port<5360.384>,<5360.6.0>]},
{monitors,[]},
{monitored_by,[<5360.48.0>]},
{heap_size,233}]
ok
Does that mean anything to you?
Post by Alvaro Videla
The module is called rabbit_diagnostics with an 's' at the end,
that's why you get that error.
undef means, the function you are trying to call is undefined
Post by Raymond Rizzuto
I retried both commands I was having with double quotes instead of
C:\Program Files (x86)\RabbitMQ
Server\rabbitmq_server-3.5.3.91\sbin> rabbitmqctl eval "node()."
C:\Program Files (x86)\RabbitMQ
Server\rabbitmq_server-3.5.3.91\sbin> rabbitmqctl eval
"rabbit_diagnostic:maybe_stuck()."
Error: {undef,[{rabbit_diagnostic,maybe_stuck,[],[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,657}]},
{rpc,'-handle_call_call/6-fun-0-',5,
[{file,"rpc.erl"},{line,205}]}]}
The first command works, the second command is still saying
"Error:", but getting further.
It might be worth indicating that double quotes need to be used, at
least on Windows.
Post by 'Robert Raschke' via rabbitmq-users
I think you need to use double, not single, quotes around the
Erlang command to be eval'ed.
I updated to 3.5.4 rc2, and the batch files are better. However, I
Post by 'Robert Raschke' via rabbitmq-users
Post by Raymond Rizzuto
C:\Program Files (x86)\RabbitMQ
Server\rabbitmq_server-3.5.3.91\sbin> rabbitmqctl eval
'rabbit_diagnostics:maybe_stuck().'
I also tried this example from the manual, with similar results.
C:\Program Files (x86)\RabbitMQ
Server\rabbitmq_server-3.5.3.91\sbin> rabbitmqctl eval 'node().'
Post by Raymond Rizzuto
Also, the connection is re-established, but more than a
minute
=INFO REPORT==== 15-Jul-2015::10:40:55 ===
accepting AMQP connection <0.1483.0> ([::1]:55758 ->
[::1]:5672)
I think this may account for why my data rate drops.
OK, this is pretty curious. Thanks.
--
MK
Staff Software Engineer, Pivotal/RabbitMQ
Alvaro Videla
2015-07-21 15:29:36 UTC
Permalink
The queue_index_max_journal_entries default value is 65536
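
For reference, a minimal sketch of how both queue index settings could be set explicitly in rabbitmq.config, assuming the classic Erlang-term config format used elsewhere in this thread. The values shown are simply the defaults quoted in this discussion (4096 bytes and 65536 entries); they are illustrative, not tuning advice, and have not been tested against the setup described here.

```erlang
%% rabbitmq.config (sketch; values are the defaults mentioned in this thread)
[
  {rabbit, [
    %% Messages smaller than this many bytes are embedded in the queue index.
    %% Setting it to 0 keeps small messages out of the index entirely.
    {queue_index_embed_msgs_below, 4096},
    %% Number of journal entries held before the index journal is flushed to disk.
    {queue_index_max_journal_entries, 65536}
  ]}
].
```

The broker has to be restarted for changes in this file to take effect.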
Raymond Rizzuto
2015-07-24 18:50:52 UTC
Permalink
I did a retest, and still see undesirable behavior.

My environment is Erlang/OTP 17.5, Windows 7 64 bit. 16G ram, 240G SSD
dedicated to just RabbitMQ's persistence files.

Config settings:

{tcp_listeners, [{"127.0.0.1", 5672}]},
{log_levels, [{connection, debug}, {channel, debug}]},
{heartbeat, 60},
{vm_memory_high_watermark, 0.8},
{vm_memory_high_watermark_paging_ratio, 0.5}


Here's what I did:


1. Purged the queue.
2. Stopped the RabbitMQ server.
3. Deleted the log files.
4. Started the RabbitMQ server.
5. Ran the command for stuck processes; it indicated 1 suspicious
process, user in user.erl.
6. Checked memory: 41MB in use, high watermark of 13GB.
7. Started a .NET process on the same PC publishing 1k messages @ 5K/sec.
8. ~7 minutes later, throughput dropped to 0 for about 1 minute.
9. About halfway through that minute, the .NET application logged this
exception: System.IO.IOException: Unable to write data to the transport
connection: A connection attempt failed because the connected party did not
properly respond after a period of time, or established connection failed
because connected host has failed to respond. --->
System.Net.Sockets.SocketException: A connection attempt failed because the
connected party did not properly respond after a period of time, or
established connection failed because connected host has failed to respond
10. At about the same time, I saw 2 disk writes for the queue, and
afterwards the .NET process started publishing again.
11. There were no memory alarms in the RabbitMQ log, and memory usage as
reported by the web UI was only ~5G.

Let me know if there is any additional logging I can turn on or detail I
can provide.
Raymond Rizzuto
2015-07-30 00:29:47 UTC
Permalink
Any suggestions? Having queuing stop for a minute isn't acceptable in our
environment, and I can't see any reason why the publisher would be flow
controlled off when memory use has not hit the high water mark.
Michael Klishin
2015-07-30 01:11:46 UTC
Permalink
Any suggestions? Having queuing stop for a minute isn't acceptable
in our environment, and I can't see any reason why the publisher
would be flow controlled off when memory use has not hit the high
water mark.
Raymond,

The ratio of the high VM watermark after which data will be moved to disk
“in bulk” is configurable and is 0.5 by default. You can try bumping that to 0.9 or so.
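
A sketch of what that change could look like in rabbitmq.config, assuming the 0.8 watermark from the config posted earlier in this thread is kept. With 16GB of RAM that watermark is roughly 12.8GB, so bulk paging would begin at about 11.5GB with a ratio of 0.9, rather than about 6.4GB with the default 0.5. The 0.9 value is only the suggestion above, not a verified fix.

```erlang
%% rabbitmq.config (sketch; 0.8 is taken from the config posted in this thread)
[
  {rabbit, [
    %% Fraction of installed RAM at which the memory alarm blocks publishers.
    {vm_memory_high_watermark, 0.8},
    %% Fraction of the watermark at which queues start paging messages to disk.
    {vm_memory_high_watermark_paging_ratio, 0.9}
  ]}
].
```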

We’ve identified the function that has the most effect on throughput. There are
no conclusions yet on how easy it may be to improve this.

There’s a GitHub issue you can watch:
https://github.com/rabbitmq/rabbitmq-server/issues/227 

Our team is tiny and it’s holiday season. We will get to it as time permits.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Raymond Rizzuto
2015-07-30 01:56:07 UTC
Permalink
I understand about the small team and holiday season, ditto here. I'm the
only developer working on the task to convert from MSMQ to RabbitMQ, and
this is only one of several projects I have to work on. Some days
multitasking stinks!

I looked at the GitHub issue, and that may apply. What is odd, however, is
that the I/O graph on the RabbitMQ web UI shows 2 small bursts during the 1
minute lull in performance. Writing 5G to an SSD shouldn't take a minute,
and the I/O graph seems to concur. Is the time due to serialization of
objects prior to I/O?

Ideally the bulk move should overlap receipt of more data, so setting the
disk write to 50% of the high water mark should allow more time to write
before hitting the high water mark.

I will circle back to trying persistent messages. They should be persisted
earlier, if I understand correctly, so hitting the write-to-disk threshold
should require fewer writes. Is that accurate?
Raymond Rizzuto
2015-07-14 15:28:22 UTC
Permalink
Right, but if they are already persisted, and later need to be removed from
memory due to pressure, I would think that no write would be required, and
hence no long delay due to disk I/O like I see in the case with nonpersistent
messages.
Post by Michael Klishin
If I understand correctly, they get persisted as they are received,
so there shouldn't be any big pause to persist 1 million messages
due to memory pressure.
Messages can be kept in RAM even after they are moved to disk.
--
MK
Staff Software Engineer, Pivotal/RabbitMQ