Here it stops again at approx. 10:44 a.m. EST. Again UDP:514 is 65550. Packets are still coming in, but nothing is logged. I tried logging via logger and no messages appeared. At 10:53 a.m. EST it starts again, and the messages I sent via logger are then logged.

vmstat doesn't indicate much disk activity, blocks in or out, either before or after. Utilization on the box is pretty steady. Swap neither grows nor shrinks. Memory is mostly used, with maybe about 10MB free, and there is lots of shared memory.

Graphs show that loopback is comparatively quite active at 10:45. The following is from a dump of an rrdtool database monitoring loopback utilization. This doesn't correlate with other times when I've seen the problem though, specifically when it stopped logging at 10:10 p.m. EST. Normal readings for the graph are around 2.e+01.

<!-- 2001-03-29 10:40:00 EST --> <row><v> 2.8297444075e+01 </v><v> 2.8297444075e+01 </v></row>
<!-- 2001-03-29 10:45:00 EST --> <row><v> 5.7677917503e+01 </v><v> 5.7677917503e+01 </v></row>
<!-- 2001-03-29 10:50:00 EST --> <row><v> 9.0888004719e+01 </v><v> 9.0888004719e+01 </v></row>

POLLIN}, {fd=7, events=0}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 17, 100) = 0
poll([{fd=12, events=0}, {fd=11, events=0}, {fd=21, events=0}, {fd=10, events=0}, {fd=22, events=POLLIN}, {fd=14, events=0}, {fd=13, events=POLLIN}, {fd=17, events=0}, {fd=16, events=POLLIN}, {fd=15, events=0}, {fd=9, events=0}, {fd=8, events=POLLIN}, {fd=7, events=0}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN, revents=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 17, 6000) = 1
recvfrom(5, "<133>559901: 4w3d: %UBR7200-5-MA"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(49364), sin_addr=inet_addr("24.247.48.35")}}, [16]) = 163
time(NULL) = 985880402
time(NULL) = 985880402
time(NULL) = 985880402
time(NULL) = 985880402
poll([{fd=12, events=0}, {fd=11, events=0}, {fd=21, events=0}, {fd=10, events=0}, {fd=22, events=POLLIN}, {fd=14, events=0}, {fd=13, events=POLLIN}, {fd=17, events=0}, {fd=16, events=POLLIN}, {fd=15, events=0}, {fd=9, events=0}, {fd=8, events=POLLIN}, {fd=7, events=POLLOUT, revents=POLLOUT}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 17, 100) = 1
write(7, "Mar 29 10:40:02 24.247.48.35 559"..., 188) = 188
time(NULL) = 985880402
poll([{fd=12, events=0}, {fd=11, events=0}, {fd=21, events=0}, {fd=10, events=0}, {fd=22, events=POLLIN}, {fd=14, events=0}, {fd=13, events=POLLIN}, {fd=17, events=0}, {fd=16, events=POLLIN}, {fd=15, events=0}, {fd=9, events=0}, {fd=8, events=POLLIN}, {fd=7, events=0}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 17, 100) = 0
poll([{fd=12, events=0}, {fd=11, events=0}, {fd=21, events=0}, {fd=10, events=0}, {fd=22, events=POLLIN}, {fd=14, events=0}, {fd=13, events=POLLIN}, {fd=17, events=0}, {fd=16, events=POLLIN}, {fd=15, events=0}, {fd=9, events=0}, {fd=8, events=POLLIN}, {fd=7, events=0}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN, revents=POLLIN}], 17, 4000) = 1
read(3,

I don't know. Nothing to correlate it to stopping at 10:10:00 p.m. EST last night though...

I'm just not seeing anything that sticks out to indicate where a problem might be.

Can someone who is running a central logging server indicate how they have configured syslog-ng, as far as log_fifo_size, garbage collection options, number of objects alive at any one time after running for a while, and sync options? I would greatly appreciate it.

Thanks,
Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
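When lining up an strace capture with the times at which logging stalls, it can help to convert the time(NULL) return values into wall-clock time. The following is a minimal sketch, assuming a fixed UTC-5 offset for EST (no daylight saving in effect on that date); the helper name strace_time_to_est is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST is modelled as a fixed UTC-5 offset, with no DST handling.
EST = timezone(timedelta(hours=-5), "EST")


def strace_time_to_est(epoch_seconds: int) -> str:
    """Convert a time(NULL) return value from strace into an EST timestamp."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=EST)
    return moment.strftime("%Y-%m-%d %H:%M:%S %Z")


if __name__ == "__main__":
    # Value taken from the strace output in the message above.
    print(strace_time_to_est(985880402))  # 2001-03-29 10:40:02 EST
```

The result agrees with the "Mar 29 10:40:02" prefix of the line written to fd 7 in the same trace.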
Something seems to block syslog-ng. DNS maybe? syslog-ng issues a DNS lookup for every message if DNS is turned on. What is fd=3? It is reported as readable, but reading from it seems to block.
--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
On Thu, 29 Mar 2001, Balazs Scheidler wrote:
> something seems to block syslog-ng. DNS maybe? syslog-ng issues a dns lookup for every message if DNS is turned on. What is fd=3 ? It indicates readability, but as it seems reading from it blocks.
How would I determine what that correlates to?

Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
On Thu, Mar 29, 2001 at 12:43:36PM -0500, Brian E. Seppanen wrote:
> On Thu, 29 Mar 2001, Balazs Scheidler wrote:
> > something seems to block syslog-ng. DNS maybe? syslog-ng issues a dns lookup for every message if DNS is turned on. What is fd=3 ? It indicates readability, but as it seems reading from it blocks.
> How would I determine what that correlates too?
If you have the full strace log from startup, you should see where syslog-ng opens fd #3 by examining the return values from open() or socket(). If you don't, you can still check where the symlink named 3 in the /proc/<pid>/fd/ directory points to, or use lsof -p <pid>.
--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
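The /proc/<pid>/fd/ lookup described above can also be scripted. This is a small sketch, assuming a Linux system with /proc mounted and enough privileges to read the target process's fd directory; the function names describe_fd and list_fds are made up for illustration and are not part of syslog-ng or lsof.

```python
import os


def describe_fd(pid, fd):
    """Return the target of the /proc/<pid>/fd/<fd> symlink, or None if unavailable."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # No such pid or fd, no /proc, or insufficient permissions.
        return None


def list_fds(pid):
    """Map every open descriptor number of a process to what it points at."""
    try:
        names = os.listdir(f"/proc/{pid}/fd")
    except OSError:
        return {}
    targets = {}
    for name in names:
        target = describe_fd(pid, name)
        if target is not None:
            targets[int(name)] = target
    return targets


if __name__ == "__main__":
    # Inspect this script's own descriptors; substitute the syslog-ng pid as needed.
    for fd, target in sorted(list_fds(os.getpid()).items()):
        print(fd, "->", target)
```

Regular files show up as paths, while sockets and pipes show up as names such as socket:[inode], so a descriptor opened with socket() will not resolve to a file name.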
On Fri, 30 Mar 2001, Balazs Scheidler wrote:
> if you have the full strace log from startup, you should see where syslog-ng opens fd #3 by examining the return values from open() or socket(). If you don't, you could still check /proc/<pid>/fd/ directory where the symlink named #3 points to, or use lsof -p <pid>
You learn something new every day. That explains so much that I didn't understand about proc earlier... I had no idea those numeric directories correlated with pids; it just escaped me. Thanks.

It seems to do it on fd 3 regardless of what it correlates to. Earlier this was set as a filter to match on a specific entry. lsof -p indicates that it is now associated with /proc/kmsg. I've removed the specific filter. In fact I've cleaned up a lot of unnecessary junk. However, it still seems to block on that file descriptor read at various times. It normally restarts though.

However, last night I killed the strace, cleaned up my config, thought I'd be okay, and restarted it in verbose mode. I left it running in verbose mode, and at 6:30 a.m. it was doing idle garbage collection every second. I had 900 objects alive, and then it just stopped. No more logging. Process still running, no more garbage collection, nothing.

The possibility remains that I misread the strace output earlier and that /proc/kmsg has always been associated with fd 3, and there could be some problems with reading kernel messages at certain points... It would make more sense for /proc/kmsg to be associated with a low-numbered file descriptor than a specific filter, so I'm leaning toward that theory.

Is it possible for me to run syslog and syslog-ng concurrently? As a workaround and test case, I'm thinking I could run syslog to capture kernel messages only, and leave syslog-ng to log everything else.

I also noticed that when I HUPped the process while stracing, with debugging and verbose on, I get tons of these messages, and they repeat frequently. These are some examples.
) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (43) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (41) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (42) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (43) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (44) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (42) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (43) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (44) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (45) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (43) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (44) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (45) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (46) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (44) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (45) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (46) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (47) ) = 58 write(2, "gc_mark: Marking object of class"..., 
55gc_mark: Marking object of class 'log_connection' (45) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (46) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (47) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (48) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (46) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (47) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (48) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (49) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (47) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (48) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (49) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (50) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (48) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (49) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (50) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (51) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (49) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (50) 
) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (51) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (52) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (50) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (51) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (52) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (53) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (51) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (52) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (53) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (54) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (52) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (53) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (54) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (55) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (53) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (54) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (55) ) = 58 write(2, "gc_mark: Marking object of class"..., 
58gc_mark: Marking object of class 'log_endpoint_info' (56) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (54) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (55) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (56) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (57) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (55) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (56) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (57) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (58) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (56) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (57) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (58) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (59) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (57) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (58) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (59) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (60) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (58) 
) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (59) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (60) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (61) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (59) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (60) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (61) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (62) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (60) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (61) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (62) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (63) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (61) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (62) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (63) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (64) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (62) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (63) ) = 58 write(2, "gc_mark: Marking object of class"..., 
58gc_mark: Marking object of class 'log_endpoint_info' (64) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (65) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (63) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (64) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (65) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (66) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (64) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (65) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (66) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (67) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (65) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (66) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (67) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (68) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (66) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (67) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (68) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' 
(69) ) = 58 write(2, "gc_mark: Marking object of class"..., 55gc_mark: Marking object of class 'log_connection' (67) ) = 55 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (68) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (69) ) = 58 write(2, "gc_mark: Marking object of class"..., 58gc_mark: Marking object of class 'log_endpoint_info' (70) ) = 58

Thanks,
Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
My logging worked just fine this entire weekend without any problems, after I disabled the /proc/kmsg source. Any ideas on what might be causing syslog-ng to not like receiving kernel messages? I'm running redhat-7.0 with linux-2.2.16-22enterprise, which provides SMP support. We recently upgraded this from a single-processor box to a dual-processor box. Any chance that could affect this?

Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
On Mon, 2 Apr 2001, Brian E. Seppanen wrote:
> My logging worked just fine this entire weekend without any problems, after I disabled the /proc/kmesg source. Any ideas on what might be causing syslog-ng to not like receiving kernel messages? I'm running redhat-7.0 with linux-2.2.16-22enterprise, which provides SMP support. We recently upgraded this from a single processor box to a dual processor box. Any chance that could affect this?
Does anyone have any recommendations? I really like using syslog-ng. I don't want to ditch everything I've put into it. I'd rather not have to ignore kernel messages to continue to use it though. I'm not going to be able to get any more info than I've already provided. Running strace with debugging eventually ends up with it hanging while reading on file descriptor three, which is:

COMMAND     PID USER   FD   TYPE DEVICE SIZE NODE NAME
syslog-ng 11791 root    3r   REG    0,1    0    5 /proc/kmsg

I don't get any further information from it to indicate where the problem might be. In fact I left it going all last night. It stopped on fd 3 again, with no further details. Not only that, but SSH quit responding, as I had started the strace over an SSH connection. I tried logging in via the console and my logins would time out. Other processes continued to work as expected. Graphs don't show anything out of the ordinary. Once I got to my workstation and ended the strace, I could SSH and log in again.

I've run it for days without problems, but only when not logging kernel messages. Once I re-enable kernel messages it starts dying. I don't believe I had this problem before, so I think I might downgrade from 1.4.11 back to 1.4.10.

Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
Hm... 1.4.10 was running without problems? I might check the difference. Hanging on /proc/kmsg might be a bug in the kernel you are running: poll() indicates readability, but reading on the fd fails (even though it's in non-blocking mode).

You may try to use klogd, and use that to direct kernel logs to syslog-ng.
--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
> You may try to use klogd, and use that to direct kernel logs to syslog-ng.
There's no way for me to say this without sounding like a complete idiot, but I thought klogd needed to be running to direct messages to /proc/kmsg, which syslog-ng would then read? If this isn't the case, then for all I know I've been creating contention for that file by having syslog-ng reading from it while klogd was also running. There may be no trouble with syslog-ng other than my ignorance.

I've stopped klogd and I'll see if I still get kernel messages. Thankfully it won't take long, as we usually see a bad ICMP packet or two during the day, which the kernel logs.

Thanks,
Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
On Thu, Apr 05, 2001 at 11:31:19AM -0400, Brian E. Seppanen wrote:
> > You may try to use klogd, and use that to direct kernel logs to syslog-ng.
> There's no way for me to say this without sounding like a complete idiot, but I thought klogd needed to be running to direct messages to /proc/kmesg, which syslog-ng would then read?
Don't beat yourself up. Technology moves so quickly sometimes that it's hard to keep it all straight. The Linux kernel is certainly "guilty" of this. If you don't keep up with LKML or at least read all the changelog files, you'll never really understand all the nuances of the evolving OS.

Let us know if your problem clears up. ;-) If anything, you've pointed out something that might be good to note in the FAQ or the INSTALL docs.
--
Chad Walstrom <chewie@wookimus.net> | a.k.a. ^chewie
http://www.wookimus.net/ | s.k.a. gunnarr
Key fingerprint = B4AB D627 9CBD 687E 7A31 1950 0CC7 0B18 206C 5AFD
> the evolving OS. Let us know if your problem clears up. ;-) If anything, you've pointed out something that might be good to note in the FAQ or the INSTALL docs.
It appears that my problem was running klogd while also using syslog-ng to pick off messages from /proc/kmsg. With klogd stopped, I've been getting kernel messages just fine, and my logging has continued working overnight. If a problem were going to show up, it would have already done so.

Thanks,
Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net
On Thu, Apr 05, 2001 at 11:31:19AM -0400, Brian E. Seppanen wrote:
> > You may try to use klogd, and use that to direct kernel logs to syslog-ng.
> There's no way for me to say this without sounding like a complete idiot, but I thought klogd needed to be running to direct messages to /proc/kmesg, which syslog-ng would then read? If this isn't the case then for all I know I've been creating contention for that file by having both syslog-ng reading from it as well as having klogd running. There may be no trouble with syslog-ng other than my ignorance. I've stopped klogd and I'll see if I still get kernel messages, thankfully it won't take long as we usually see a bad ICMP packet or two during the day which the kernel logs.
Oops, that may cause a race condition. Both klogd and syslog-ng try to read kmsg, and if klogd wins, syslog-ng stays blocked. I don't know why the fd is not in non-blocking mode though.
--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
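The race described above can be illustrated without touching /proc/kmsg. The following is only a sketch under loud assumptions: an ordinary pipe stands in for /proc/kmsg, a duplicated descriptor stands in for klogd as the competing reader, and the function name demo_race is made up. It does not reproduce the behaviour of the 2.2 kernel discussed in this thread; it only shows why a reader that loses the race needs a non-blocking descriptor in order not to hang in read().

```python
import os
import select


def demo_race():
    """Show poll() reporting readability while a competing reader drains the data first."""
    read_end, write_end = os.pipe()
    # Non-blocking mode: an empty read raises BlockingIOError instead of hanging.
    os.set_blocking(read_end, False)
    os.write(write_end, b"<6>example kernel line\n")

    poller = select.poll()
    poller.register(read_end, select.POLLIN)
    ready = bool(poller.poll(100))  # poll() says data is waiting

    # Hypothetical competitor (standing in for klogd) wins the race.
    competitor = os.dup(read_end)
    stolen = os.read(competitor, 1024)

    try:
        data = os.read(read_end, 1024)
    except BlockingIOError:
        # Nothing left to read; a blocking descriptor would have stalled here.
        data = None

    for fd in (read_end, write_end, competitor):
        os.close(fd)
    return ready, stolen, data


if __name__ == "__main__":
    print(demo_race())
```

With the descriptor left in blocking mode, the final os.read() would wait until more data arrived, which matches the stall described in this thread.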
> something seems to block syslog-ng. DNS maybe? syslog-ng issues a dns lookup for every message if DNS is turned on. What is fd=3 ? It indicates readability, but as it seems reading from it blocks.
I should have mentioned in my earlier email that I have DNS lookups disabled. Brian Seppanen Charter Communications Regional Data Center 906-228-4226 ext 23 Marquette, MI seppy@chartermi.net