[RESOLVED] [BAN] Temporary ban to protect map-server, 151 characters affected

Posted: 28 Dec 2018, 07:39
by Freeyorp101
Over the last few days the map server has been repeatedly killed by the OOM killer because the write queues of many sockets grew too large all at once.

Typical incident:

Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 47 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 74 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 32 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 43 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 63 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 24 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 13 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 69 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 19 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 27 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 78 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw kernel: kworker/u8:1 invoked oom-killer: gfp_mask=0x15080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), nodemask=(null), order=1, oom_score_adj=0
[oomkiller details]
Dec 28 04:28:50 tmw kernel: Out of memory: Kill process 3968 (tmwa-map) score 804 or sacrifice child
Dec 28 04:28:50 tmw kernel: Killed process 3968 (tmwa-map) total-vm:11620596kB, anon-rss:11536760kB, file-rss:1204kB, shmem-rss:0kB
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 23 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 15 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 53 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 54 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 58 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 25 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 112 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 81 wdata expanded to 134217728 bytes.
[delayed log output of many, *many* other sockets also increasing their write queue]
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 46 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 90 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 50 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 56 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 120 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 95 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 34 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 79 wdata expanded to 134217728 bytes.
Dec 28 04:28:51 tmw kernel: oom_reaper: reaped process 3968 (tmwa-map), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
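For a sense of scale, the figures in the log above can be worked through directly. The following is a minimal Python sketch that parses a few lines copied from the incident; the regular expressions and the `summarize` helper are illustrative names, not part of tmwa, and it assumes the kernel's "kB" means 1024 bytes.

```python
import re

# Three lines copied verbatim from the incident log above.
LOG = """\
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 47 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw tmwa-map[3968]: socket: 74 wdata expanded to 134217728 bytes.
Dec 28 04:28:50 tmw kernel: Killed process 3968 (tmwa-map) total-vm:11620596kB, anon-rss:11536760kB, file-rss:1204kB, shmem-rss:0kB
"""

# Hypothetical patterns written for this log format only.
WDATA_RE = re.compile(r"socket: (\d+) wdata expanded to (\d+) bytes")
RSS_RE = re.compile(r"anon-rss:(\d+)kB")


def summarize(log):
    """Return ({socket: write buffer size in bytes}, anon-rss in kB or None)."""
    buffers = {int(fd): int(size) for fd, size in WDATA_RE.findall(log)}
    match = RSS_RE.search(log)
    rss_kb = int(match.group(1)) if match else None
    return buffers, rss_kb


buffers, rss_kb = summarize(LOG)
buffer_bytes = max(buffers.values())

# Size of one expanded write buffer, in MiB.
buffer_mib = buffer_bytes / 2**20
# How many buffers of that size fit in the resident memory at kill time
# (assumption: the kernel's "kB" is 1024 bytes).
buffers_in_rss = (rss_kb * 1024) // buffer_bytes

print(buffer_mib, buffers_in_rss)
```

Each expanded buffer is 2^27 bytes, or 128 MiB. The eleven sockets logged before the oom-killer line account for 1.375 GiB between them, and the roughly 11 GiB of anon-rss at kill time is the equivalent of about 88 buffers of that size.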
The following characters are affected:

'Aamon'
'Abezethibou'
'Abraxax'
'Abyzou'
'Adrammelech'
'Aeshma'
'Agaliarept'
'Agiel'
'Agrat'
'Alloces'
'Allu'
'Amaymon'
'Amdusias'
'Anammelech'
'Ancitif'
'Andhaka'
'Andrealphus'
'Anzu'
'Armaros'
'Arunasura'
'Asag'
'Asakku'
'Bael'
'Bakasura'
'Baku'
'Balberith'
'Bali'
'Barbas'
'Barbatos'
'Barong'
'Bathin'
'Beleth'
'Berith'
'Bifrons'
'Botis'
'Buer'
'Bukavac'
'Bune'
'Bushyasta'
'Chax'
'Chemosh'
'Cimejes'
'Corson'
'Crocell'
'Culsu'
'Daeva'
'Dajjal'
'Danjal'
'Dantalion'
'Decarabia'
'Demiurge'
'Drekavac'
'Dzoavits'
'Eblis'
'Eisheth'
'Eligos'
'Foras'
'Forcas'
'Forneus'
'Forras'
'Furcas'
'Gaap'
'Gaderel'
'Gaki'
'Gamigin'
'Glasya'
'Gremory'
'Grigori'
'Gualichu'
'Guayota'
'Haagenti'
'Haborym'
'Hauras'
'Haures'
'Havres'
'Hinn'
'Ipos'
'Jikininki'
'Kabandha'
'Kasadya'
'Killakee'
'Kroni'
'Kukudh'
'Kumbhakarna'
'Lechies'
'Lempo'
'Leraie'
'Leraje'
'Leyak'
'Lilin'
'Ljubi'
'Lucifuge'
'Marchosias'
'Maricha'
'Masih'
'Mastema'
'Merihem'
'Morax'
'Murmur'
'Naamah'
'Naberus'
'Namtar'
'Ninurta'
'Onoskelis'
'Ordog'
'Orias'
'Oriax'
'Orobas'
'Otokata'
'Paimon'
'Pelesit'
'Penemue'
'Phenex'
'Pithius'
'playerone'
'Pocong'
'Pontianak'
'Preta'
'Pruflas'
'Puloman'
'Rahab'
'Rakshasa'
'Rangda'
'Ronove'
'Rusalka'
'Sabnock'
'Saleos'
'Seir'
'Semyaza'
'Shedim'
'Sitri'
'Sthenno'
'Stihi'
'Stolas'
'Suanggi'
'Surgat'
'Titivillus'
'Toyol'
'Tuchulcha'
'Ukobach'
'Valac'
'Valefar'
'Vapula'
'Vassago'
'Vepar'
'Vine'
'Wechuge'
'Yeqon'
'Zagan'
'Zepar'
'Ziminiar'
The IP from which all of these characters have been connecting has been temporarily banned.

This is a technical ban rather than a personal ban, made to protect the uptime of the server. The intention is this ban will be reversed once the owner can be contacted and told to tone it down a bit, and/or sane sendq/recvq limits are implemented on the server. (Any evasion prior to contact, however, *is* potentially liable for a personal ban.)

This is completely separate from any actions the game masters may or may not decide to take independently, based on their best judgment and any particulars of the case.
#themanaworld wrote:
(16:58:57) < John_H> Anyone up?
(17:01:13) < John_H> If there is a dev/admin on, the server has been kicking players a lot for the last couple of days
[...]
(18:31:26) < Freeyorp> If John_H comes back and sticks around for any amount of time, let them know that the map server got dunked on by the oomkiller about half an hour before they sent their messages
(18:32:48) < Freeyorp> (and about 5.5h before that, and about 2.5h before /that/... etc, etc)
(18:44:08) < Freeyorp> https://ncry.pt/p/GxOn#Vm05i9Tt5Rkw9LKf ... cfBW0ugU2A ... those make for some pretty inflated buffers all in the moments right before oomkiller
(18:45:14) < Freeyorp> 11 connections wanting a 134M buffer? Yeah, that'd do it
(18:46:40) < Freeyorp> shitloads of accounts connecting in the seconds before oomkill, all in alphabetical order, as if read out from a dictionary somewhere... yeah, looks like a DoS
(18:48:37) < gumi> that's not really an intentional DoS, it's just a player that has been experimenting controlling an army of 100+ bots all at once
(18:48:46) < Freeyorp> what
This thread will be updated with any further developments.

---Freeyorp

Re: [BAN] Temporary ban to protect map-server, 151 characters affected

Posted: 28 Dec 2018, 16:08
by playerone
Freeyorp101 wrote:
28 Dec 2018, 07:39
This is a technical ban rather than a personal ban, made to protect the uptime of the server. The intention is this ban will be reversed once the owner can be contacted and told to tone it down a bit, and/or sane sendq/recvq limits are implemented on the server. (Any evasion prior to contact, however, *is* potentially liable for a personal ban.)
Hello, I'm the owner of these bots.
The buffer grew because the bots were sending CMSG_PLAYER_CHANGE_DEST packets (0x0085) with collision positions as the destination. Since these actions have no send limit, the buffer can grow quickly and overflow when other characters are connecting on the same map.
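To illustrate what a send limit on such actions could look like, here is a minimal token-bucket sketch in Python. It is not tmwa code: the class name, the rate, and the burst size are all hypothetical, and the idea is simply that a session may only have a bounded number of movement requests processed per second, with the rest dropped.

```python
class PacketRateLimiter:
    """Token bucket: allows `burst` packets at once, refilled at `rate_per_sec`.

    Hypothetical helper for illustration; one instance per session.
    """

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = None  # timestamp of the previous call, in seconds

    def allow(self, now):
        """Return True if a packet arriving at time `now` may be processed."""
        if self.last is not None:
            # Refill in proportion to elapsed time, capped at the burst size.
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Assumed limits for the example: 5 move requests per second, burst of 10.
limiter = PacketRateLimiter(rate_per_sec=5, burst=10)
accepted_in_burst = sum(limiter.allow(0.0) for _ in range(15))
accepted_a_second_later = sum(limiter.allow(1.0) for _ in range(15))
```

With these assumed numbers, a burst of 15 requests at the same instant has 10 accepted, and a second burst one second later has only the 5 refilled tokens to spend.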

I ran the same test on localhost, but there the sessions were closed due to response timeouts and the map-server froze,
since I'm not using the out-of-memory killer.

socket: 39 wdata expanded to 16777216 bytes.
socket: 58 wdata expanded to 16777216 bytes.
Session #39 timed out
Player [Amdusias] has logged off your server.
Session #58 timed out
Player [Anammelech] has logged off your server.
I ran the test on the tmw server at night while it was almost empty (no more than 10 users connected), and I don't intend to reproduce the bug again. Take whatever action you consider appropriate with the bots; if people think they should stay banned, that's fine with me.

Happy holidays!

Re: [BAN] Temporary ban to protect map-server, 151 characters affected

Posted: 28 Dec 2018, 19:40
by Freeyorp101
Thank you for responding so quickly.

I'm happy with this as an acknowledgment, and have lifted the IP ban. That's all, as far as my side of things is concerned, and I hope this hasn't affected much else for you in the meantime.

It seems even a little bit of real world latency is enough to make for several dozen connections wanting buffers of hundreds of megabytes each, unfortunately. I'll see what I can do to put more sensible limits in place - so that at least the stability of the server isn't threatened in these unusual cases - soon enough, with any luck.
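As a rough illustration of the kind of limit discussed here, the sketch below caps a per-session write queue and closes the session instead of growing the buffer past the cap. It is a Python model under stated assumptions, not the tmwa implementation: the class, its method names, and the cap value are all hypothetical.

```python
class BoundedSendQueue:
    """Per-session write queue that refuses to grow past `max_bytes`.

    Hypothetical model: a real server would also close the underlying socket.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.buf = bytearray()
        self.closed = False

    def enqueue(self, data):
        """Queue `data` for sending; return False if the session was dropped."""
        if self.closed:
            return False
        if len(self.buf) + len(data) > self.max_bytes:
            # The client is not draining fast enough: drop the session and
            # free its buffer rather than expanding it again.
            self.closed = True
            self.buf.clear()
            return False
        self.buf += data
        return True

    def drain(self, n):
        """Simulate the socket accepting up to `n` bytes from the queue."""
        sent = bytes(self.buf[:n])
        del self.buf[:n]
        return sent


# Tiny cap so the behaviour is easy to follow; a real cap would be far larger.
queue = BoundedSendQueue(max_bytes=8)
queue.enqueue(b"abcd")
queue.enqueue(b"efgh")      # queue is now exactly at the cap
sent = queue.drain(3)       # slow client accepts only 3 bytes
still_open = queue.enqueue(b"1234")  # 5 + 4 bytes would exceed the cap
```

The design choice in this sketch is to sacrifice the one slow or flooding session rather than let its buffer keep expanding, so the memory a single connection can hold is bounded by the cap.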

Happy holidays!

---Freeyorp