I'm trying to add an interface poller for a Cisco UBR 10k. I created a UnDP poller to poll for SNR on an upstream interface, but the poller is not collecting correctly. The NPM.jobs.log file shows a Timeout error, yet a packet capture taken on the server shows the SNMP request and a reply with valid data. Also, in the UnDP GUI under "all defined pollers" my leaf is blue instead of green. Any ideas what the issue is?
Thanks
NPM 9.0sp1
Error from log
2008-07-28 17:20:21,578 [13] DEBUG SolarWinds.NPM.Jobs.SnmpJob - { Execute entered
2008-07-28 17:20:21,578 [13] DEBUG SolarWinds.NPM.Jobs.SnmpJob - { ExecuteInternal entered
2008-07-28 17:20:21,578 [13] DEBUG SolarWinds.NPM.Jobs.SnmpJob - } ExecuteInternal exited
2008-07-28 17:20:21,578 [13] DEBUG SolarWinds.NPM.Jobs.SnmpJob - Returning 1 results.
2008-07-28 17:20:21,578 [13] DEBUG SolarWinds.NPM.Jobs.SnmpJob - Results: <?xml version="1.0" encoding="utf-16"?><SnmpJobResults xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Results> <SnmpResult> <ResultType>Error</ResultType> <OIDs /> <Labels /> <RequestID>acb3c02c-5566-43d3-9670-51c1bbb6af4f</RequestID> <ErrorMessage>31040: Timeout </ErrorMessage> </SnmpResult> </Results></SnmpJobResults>
2008-07-28 17:20:21,578 [13] DEBUG SolarWinds.NPM.Jobs.SnmpJob - } Execute exited
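For anyone who wants to count how often these errors show up, here is a rough Python sketch that pulls the request ID and error message out of a SnmpJobResults XML blob like the one logged above. The function name snmp_job_errors is made up for illustration, and it assumes you have already copied the XML portion out of the log line; it is not part of NPM.

```python
import xml.etree.ElementTree as ET

def snmp_job_errors(xml_text):
    """Return (request_id, error_message) pairs for every Error result."""
    # The logged XML declares encoding="utf-16", but we hold it as a str,
    # so drop the declaration before handing it to the parser.
    text = xml_text.lstrip()
    if text.startswith("<?xml"):
        text = text.split("?>", 1)[1]
    root = ET.fromstring(text)
    errors = []
    for result in root.iter("SnmpResult"):
        if result.findtext("ResultType") == "Error":
            request_id = result.findtext("RequestID")
            message = (result.findtext("ErrorMessage") or "").strip()
            errors.append((request_id, message))
    return errors

# XML copied from the log entry above
sample = (
    '<?xml version="1.0" encoding="utf-16"?>'
    '<SnmpJobResults xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Results> <SnmpResult> '
    '<ResultType>Error</ResultType> <OIDs /> <Labels /> '
    '<RequestID>acb3c02c-5566-43d3-9670-51c1bbb6af4f</RequestID> '
    '<ErrorMessage>31040: Timeout </ErrorMessage> </SnmpResult> </Results>'
    '</SnmpJobResults>'
)
print(snmp_job_errors(sample))
# prints [('acb3c02c-5566-43d3-9670-51c1bbb6af4f', '31040: Timeout')]
```

Running this over every Results entry in the log would give a list of failed requests that can be lined up against the packet capture timestamps.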
INFO from NPM.Businesslayer.log
2008-07-29 00:02:12,296 [17] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT JobID FROM InterfaceCustomPollerJobs WHERE InterfaceID=@interface (interface=113)
2008-07-29 00:02:12,296 [17] INFO SolarWinds.NPM.BusinessLayer.JobSchedulerEventsService - Error: 31040: Timeout
2008-07-29 00:02:12,296 [17] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT CustomPollerAssignmentID, AssignmentName, CustomPollerID, NodeID, InterfaceID FROM CustomPollerAssignment WHERE CustomPollerAssignmentID=@id (@id=72c19513-30fe-4aec-80e8-ad6055c811b7)
2008-07-29 00:02:12,296 [17] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT JobID FROM InterfaceCustomPollerJobs WHERE InterfaceID=@interface (interface=113)
2008-07-29 00:02:12,296 [17] INFO SolarWinds.NPM.BusinessLayer.JobSchedulerEventsService - Error: 31040: Timeout
2008-07-29 00:02:12,296 [23] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT * FROM Nodes WHERE NodeID=@NodeId (@NodeId=1)
2008-07-29 00:02:12,296 [23] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT * FROM Interfaces WHERE NodeID=@NodeId (@NodeId=1)
2008-07-29 00:02:12,312 [23] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT * FROM Volumes WHERE NodeID=@NodeId (@NodeId=1)
2008-07-29 00:02:12,312 [23] DEBUG SolarWinds.NPM.BusinessLayer.JobSchedulerEventsService - Job 25d0f736-4f34-405c-aba0-6b532aaca0b2 finished for node 1 at 07/29/2008 05:02:07.
2008-07-29 00:02:12,312 [23] INFO SolarWinds.NPM.BusinessLayer.JobSchedulerEventsService - Job 25d0f736-4f34-405c-aba0-6b532aaca0b2 returned 3 results.
2008-07-29 00:02:12,312 [23] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT CustomPollerAssignmentID, AssignmentName, CustomPollerID, NodeID, InterfaceID FROM CustomPollerAssignment WHERE CustomPollerAssignmentID=@id (@id=f85d0a18-6163-4371-9ba0-45e68ea2a8ae)
2008-07-29 00:02:12,312 [23] DEBUG SolarWinds.Orion.Common.SqlHelper - SQL: SELECT JobID FROM InterfaceCustomPollerJobs WHERE InterfaceID=@interface (interface=114)
2008-07-29 00:02:12,312 [23] DEBUG SolarWinds.NPM.BusinessLayer.JobSchedulerEventsService - 1.3.6.1.2.1.10.127.1.1.4.1.4.195 = 116
bwicks: Also in the UnDP gui under "all defined pollers" my leaf is blue instead of green. Any ideas what the issue is?
Yes, I get test results in the wizard when setting up the UnDP. The only time I get a result is when I restart both the Job Engine and Job Scheduler services. After that I get the Timeout errors in the log file; however, when running a packet capture on the server I see both the SNMP get and the SNMP replies with valid data. So I'm not sure what the timeout is referring to. It does not seem to be an SNMP timeout to me.
I am having the same problems. When I do a test poll from within UnDP I get good results, with no timeouts. But when the poller is applied I get gaps in my data all over the place. I am getting this with the same upstream SNR poller for Cisco CMTSs, and I was getting it with a couple of others I tested last week.
SW is unable to find the issue, which I don't understand. I see the SNMP gets and responses from the device, yet the log says timeout. It looks like they are not going to do anything about it. Adjusting the polling interval did not help.
Note from SW on their findings
After much research and testing we have not been able to reproduce this issue with the interface MIBs, which included the MIBs you selected. This is not an issue with the software or the universal poller directly. The issue reflects either the MIB value return and how often on the device in question. Even you have created additional pollers that do not reflect the issue.
The initial suggestion I received when I called SW support was to readjust polling times to the default 2 minutes for status. I am at 5 minutes for statistics. When I adjusted node and interface status polling to 90 seconds my results improved, but only slightly. I am still seeing several 30-45 minute gaps in the UnDP results for SNR on all CMTSs. This poller is currently applied to ~1500 interfaces, but nothing changed when I applied it to only about 30.
Other UnDPs I have applied on nodes work fine. I have one other UnDP applied to four Cisco 7609 Gig interfaces, with a transform for the results. It is working perfectly fine as well.
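To put numbers on gaps like these, a small Python sketch along the following lines could flag every stretch where consecutive samples are further apart than expected. It assumes you have already exported the poll timestamps for one interface by some means; find_gaps and the tolerance factor of 1.5 are hypothetical choices for illustration, not anything provided by NPM.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, interval, tolerance=1.5):
    """Return (previous, current, delta) for samples spaced too far apart.

    A gap is reported when two consecutive samples are more than
    interval * tolerance apart.
    """
    ordered = sorted(timestamps)
    limit = interval * tolerance
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta = cur - prev
        if delta > limit:
            gaps.append((prev, cur, delta))
    return gaps

# Made-up sample data: polls every 5 minutes with one 40 minute hole
start = datetime(2008, 7, 29, 0, 0)
samples = [start + timedelta(minutes=m) for m in (0, 5, 10, 50, 55)]

for prev, cur, delta in find_gaps(samples, timedelta(minutes=5)):
    print(f"gap of {delta} between {prev:%H:%M} and {cur:%H:%M}")
# prints: gap of 0:40:00 between 00:10 and 00:50
```

Comparing the gap start times against the Timeout entries in the logs would show whether every gap corresponds to a logged timeout.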
Has applying SP2 made a difference?
http://thwack.com/forums/t/10177.aspx
No, applying SP2 did not make a difference. I am still getting timeout errors in the log even after seeing the SNMP get and reply from the device.
<SnmpJobResults xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Results> <SnmpResult> <ResultType>Error</ResultType> <OIDs /> <Labels /> <RequestID>dfef3c2d-db84-4a07-afd8-98175f7ae97e</RequestID> <ErrorMessage>31040: Timeout </ErrorMessage> </SnmpResult> </Results></SnmpJobResults>
SP2 did not change anything for me either.
I just opened Case #55374
This is working now with NPM 9.1. Thanks, SolarWinds! NPM 9.1 is working great.