[BUG] Server hangs periodically
killermonkey:
OK, so here is my plan. We need to narrow down when the hang/CPU load is occurring. This will help me identify the source of the problem. Also, if there is no consistent time or event at which it occurs, we can safely say it's HLDS acting foolish.
I need the following data:
- When in the match life cycle does it hang: map load, first round, round change, or match end?
- What gameplay is active?
- Was teamplay active?
- How many players were in the server?
- Do you have any mods/plugins running?
mookie:
Samples of what I can see in log files, scanning by eye for consecutive non-mapchange log files. Unfortunately log files get truncated by any kind of crash or forced termination, so the last lines are pretty much random, and whatever was in the buffer is lost. I'll scan up in my HLSW next time it happens to see what can be seen as far as timing. In the past though I didn't notice anything unusual, just normal kills, connects, disconnects.
Deathmatch, FFA, round timer set to 420, round end time set to 12 (so 7:12 per round). Time limit is 28 minutes.
Source MetaMod 1.8.4, SourceMod 1.3.6, no other plugins, only using stock SM scripts. SourceMod logs errors for ge_bunker_classic only, something to do with entity outputs (almost certainly unrelated to this).
Started ge_facility at 01:01:53, log ends 01:14:52 (12:59+).
Started ge_facility_backzone at 03:25:19, log ends 03:40:03 (14:44+).
Started ge_silo at 20:58:03, log ends at 21:04:44 (6:41+).
Started ge_facility at 18:44:32, log ends at 19:03:46 (19:14+).
Started ge_caves at 21:08:22, log ends at 21:29:33 (21:11+).
Started ge_complex at 18:42:35, log ends at 18:48:51 (6:16+).
Started ge_archives at 15:58:17, log ends at 16:06:33 (8:16+).
Started ge_archives at 21:10:31, log ends at 21:17:56 (7:25+).
Started ge_casino at 02:03:14, log ends at 02:17:51 (14:37+).
If you figure there might be anywhere from zero to three minutes of data in the buffer when it goes down, it could be right after a round start. The only time I remember being on when this happened, it happened within ten seconds of a round starting. Then again, if you allow a fudge factor that's over 40% of your unit of measure, you can make almost anything fit a pattern.
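For anyone who wants to redo the subtraction, here is a rough sketch of it. It assumes rounds run back to back from the moment the map starts, each lasting exactly 420 + 12 = 432 seconds, which the logs do not confirm (round start/end events are not logged). The function names are made up for illustration.

```python
# Sketch only: assumes the first round starts at map start and every
# round lasts exactly ROUND_TIMER + ROUND_END seconds (an assumption).
ROUND_TIMER = 420                       # round timer, seconds
ROUND_END = 12                          # round end time, seconds
ROUND_LENGTH = ROUND_TIMER + ROUND_END  # 432 s = 7:12


def to_seconds(clock):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in clock.split(":"))
    return h * 3600 + m * 60 + s


def round_position(start, end):
    """Return (elapsed seconds, round number, seconds into that round)."""
    # The modulo tolerates a log that runs past midnight.
    elapsed = (to_seconds(end) - to_seconds(start)) % 86400
    completed_rounds, into_round = divmod(elapsed, ROUND_LENGTH)
    return elapsed, completed_rounds + 1, into_round


def fmt(seconds):
    """Format a number of seconds as M:SS."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# Start and last-logged times copied from the list above.
logs = [
    ("ge_facility", "01:01:53", "01:14:52"),
    ("ge_facility_backzone", "03:25:19", "03:40:03"),
    ("ge_silo", "20:58:03", "21:04:44"),
    ("ge_facility", "18:44:32", "19:03:46"),
    ("ge_caves", "21:08:22", "21:29:33"),
    ("ge_complex", "18:42:35", "18:48:51"),
    ("ge_archives", "15:58:17", "16:06:33"),
    ("ge_archives", "21:10:31", "21:17:56"),
    ("ge_casino", "02:03:14", "02:17:51"),
]

for name, start, end in logs:
    elapsed, rnd, into = round_position(start, end)
    print(f"{name}: {fmt(elapsed)} logged, round {rnd}, {fmt(into)} into the round")
```

Since an unknown amount of buffered log data is lost on each hang, the "into the round" figure is only a lower bound on where the server actually was.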
I see several times Luchador being dropped for no Steam logon among the last few lines. Probably unrelated.
Already using -debug, no dump files.
-game gesource -ip 216.52.143.68 +maxplayers 16 -debug -console -nocrashdialog -norestart -tickrate 66
Edit:
Attached a log file appended with what appears in my HLSW. 14:09 from start to last logged event. Unfortunately I don't see things like round end and round start getting logged so there's no way to tell if it got past the end of the round or went down with seconds to spare.
Edit2:
Another log file with truncated part appended. 20:39 from start to last logged event, so unless there could be 45 seconds without a kill happening (unlikely since the server was fullish) anything directly related to round changing looks very unlikely. Also, there was almost five minutes of data lost in the buffer, meaning all the fancy subtraction I did above is most likely meaningless.
Edit3:
Another log file. 17:18 from start to last logged event. Roughly halfway through the third round.
Mark [lodle]:
It could be the garbage collector in Python kicking in, especially for game mods that create a lot of objects.
If you can, run srcds under gdb, and when it hangs hit Ctrl+C, which will pause it. At that point type: generate-core-file and send us the file so we can have a look at what it's doing.
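If srcds was not started under gdb, attaching to the hung process should give a similar core file. The sketch below is one way to script that, assuming gdb is installed and you have permission to attach to the process; the helper names and the output path are hypothetical.

```python
import subprocess


def gdb_core_command(pid, core_path):
    """Build a gdb command line that attaches to pid, writes a core file, and exits."""
    return [
        "gdb",
        "-p", str(pid),                            # attach to the running process
        "-batch",                                  # run the -ex commands, then exit
        "-ex", f"generate-core-file {core_path}",  # same command as typed by hand
        "-ex", "detach",                           # let the process go again
    ]


def dump_core(pid, core_path):
    """Run gdb to dump a core of the hung process (hypothetical helper)."""
    return subprocess.run(gdb_core_command(pid, core_path), check=True)
```

For example, dump_core(1234, "/tmp/srcds.core") with the real PID of the srcds process in place of 1234.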
Edit:
If you're using Windows, you can create a crash dump via task explorer (right-click on srcds.exe and select create dump file).
Edit:
A Unix tool you can use to monitor srcds is htop: http://htop.sourceforge.net/
mookie:
OK, I got one right now. Do I do a minidump or a full dump?
killermonkey:
Full dump if you can, as that'll include variable values.